research-article

Equivariant Adversarial Network for Image-to-image Translation

Published: 14 June 2021

Abstract

Image-to-image translation aims to learn a mapping from a source domain to a target domain. Three main challenges, however, are associated with this problem and need to be dealt with: the lack of paired datasets, multimodality, and diversity. Convolutional neural networks (CNNs), despite their strong performance on many computer vision tasks, fail to capture the hierarchy of spatial relationships between different parts of an object and thus do not form the ideal representative model we seek. This article presents a new variant of generative models that aims to remedy this problem. We use a trainable transformer that explicitly allows the spatial manipulation of data during training. This differentiable module can be augmented into the convolutional layers of the generative model, allowing the generated distributions to be freely altered for image-to-image translation. To reap the benefits of the proposed module within the generative model, our architecture incorporates a new loss function that facilitates effective end-to-end generative learning for image-to-image translation. The proposed model is evaluated through comprehensive experiments on image synthesis and image-to-image translation, along with comparisons against several state-of-the-art algorithms.
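The trainable transformer described above follows the spirit of the spatial transformer of Jaderberg et al. (2015): a grid generator maps output coordinates through an affine transform back into the input, and a differentiable bilinear sampler reads the input at those locations, so gradients flow through the spatial manipulation. As a rough illustration only (not the authors' implementation), the sketch below implements the grid generator and sampler in NumPy for a single-channel image; the function names and setup are ours.

```python
import numpy as np

def affine_grid(theta, height, width):
    """Apply the 2x3 affine matrix `theta` to normalized target
    coordinates in [-1, 1], producing source sampling coordinates."""
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])  # 3 x HW
    src = theta @ coords                                                  # 2 x HW
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(image, xs, ys):
    """Bilinear sampling of `image` (H x W) at the normalized source
    coordinates (xs, ys) produced by `affine_grid`."""
    h, w = image.shape
    # Map normalized coordinates [-1, 1] back to pixel indices.
    px = (xs + 1.0) * (w - 1) / 2.0
    py = (ys + 1.0) * (h - 1) / 2.0
    x0 = np.clip(np.floor(px).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(py).astype(int), 0, h - 2)
    dx, dy = px - x0, py - y0
    # Weighted sum of the four neighbouring pixels.
    return (image[y0, x0] * (1 - dx) * (1 - dy)
            + image[y0, x0 + 1] * dx * (1 - dy)
            + image[y0 + 1, x0] * (1 - dx) * dy
            + image[y0 + 1, x0 + 1] * dx * dy)

# Sanity check: the identity transform reproduces the input image.
theta_id = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
img = np.arange(16.0).reshape(4, 4)
xs, ys = affine_grid(theta_id, 4, 4)
out = bilinear_sample(img, xs, ys)
print(np.allclose(out, img))  # True
```

In a full model, `theta` would be predicted by a small localization network and the sampler implemented in an autodiff framework, so the transform parameters are learned jointly with the convolutional layers.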



• Published in

  ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 2s
  June 2021, 349 pages
  ISSN: 1551-6857
  EISSN: 1551-6865
  DOI: 10.1145/3465440

      Copyright © 2021 Association for Computing Machinery.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Publication History

      • Published: 14 June 2021
      • Revised: 1 March 2021
      • Accepted: 1 March 2021
      • Received: 1 August 2020


      Qualifiers

      • research-article
      • Refereed
