Bi-GAE: A Bidirectional Generative Auto-Encoder

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Improving the generative and representational capabilities of auto-encoders is an active research topic. However, it is challenging to jointly optimize the bidirectional mapping between the encoder and the decoder/generator while ensuring convergence. Most existing auto-encoders cannot automatically trade off the two directions of this mapping. In this work, we propose Bi-GAE, an unsupervised bidirectional generative auto-encoder based on the bidirectional generative adversarial network (BiGAN). First, we introduce two terms that enhance information expansion in decoding to follow human visual models and improve semantic-relevant feature representation capability in encoding. Furthermore, we embed a generative adversarial network (GAN) to improve representation while ensuring convergence. The experimental results show that Bi-GAE achieves competitive results in both generation and representation with stable convergence. Compared with its counterparts, the representational power of Bi-GAE improves the classification accuracy of high-resolution images by about 8.09%. In addition, Bi-GAE increases the structural similarity index measure (SSIM) by 0.045 and decreases the Fréchet inception distance (FID) by 2.48 in the reconstruction of 512 × 512 images.
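
To make the SSIM figure above more concrete, the following is a minimal sketch of a single-window (global) SSIM between two images given as flat lists of pixel intensities. It is an illustration only, not the evaluation code used for Bi-GAE: the function name global_ssim and the data_range parameter are assumptions introduced here, and SSIM as usually reported is averaged over local sliding windows rather than computed once over the whole image. The constants 0.01 and 0.03 are the defaults commonly used with SSIM.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equal-length lists of pixel values.

    Hypothetical helper for illustration; a simplification of the
    usual windowed SSIM, computed from whole-image statistics.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(x)
    # Stabilising constants, scaled by the dynamic range of the pixels.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    # Means, variances and covariance over the whole image.
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    original = [0, 64, 128, 255]
    reconstruction = [10, 60, 130, 250]
    print(global_ssim(original, reconstruction))
```

A higher value indicates a reconstruction that is structurally closer to the original, with identical images scoring 1, which is why an increase in SSIM is reported as an improvement.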

Author information

Corresponding authors

Correspondence to Shi-You Qian or Ding-Yu Yang.

Additional information

This paper is the result of collaboration between Shanghai Jiao Tong University and Alibaba Group. Shi-You Qian is an associate researcher at Shanghai Jiao Tong University and Ding-Yu Yang is a senior engineer at Alibaba Group. Both can assume the responsibilities and obligations of the corresponding author.

About this article

Cite this article

Hua, Q., Hu, HW., Qian, SY. et al. Bi-GAE: A Bidirectional Generative Auto-Encoder. J. Comput. Sci. Technol. 38, 626–643 (2023). https://doi.org/10.1007/s11390-023-1902-1

  • DOI: https://doi.org/10.1007/s11390-023-1902-1