
Equivariant Adversarial Network for Image-to-image Translation

Authors:
Masoumeh Zareapoor
Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China

Jie Yang
Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 17, Issue 2s, Article No. 73, pp. 1–14. https://doi.org/10.1145/3458280

Published: 14 June 2021

Abstract

Image-to-image translation aims to learn a mapping of images from a source domain to a target domain. However, three main challenges are associated with this problem and must be addressed: the lack of paired datasets, multimodality, and diversity. Although convolutional neural networks (CNNs) perform well in many computer vision tasks, they fail to detect the hierarchy of spatial relationships between different parts of an object and thus do not form the ideal representative model we seek. This article presents a new variation of generative models that aims to remedy this problem. We use a trainable transformer, which explicitly allows the spatial manipulation of data during training. This differentiable module can be inserted into the convolutional layers of the generative model, and it allows the generated distributions to be altered freely for image-to-image translation. To reap the benefits of the proposed module in the generative model, our architecture incorporates a new loss function that facilitates effective end-to-end generative learning for image-to-image translation. The proposed model is evaluated through comprehensive experiments on image synthesis and image-to-image translation, along with comparisons with several state-of-the-art algorithms.
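The abstract does not describe the internals of the trainable transformer. As a rough illustration only, the sketch below assumes it behaves like an affine spatial transformer: a 2x3 matrix `theta` maps output pixel coordinates to source coordinates, and the feature map is resampled with bilinear weights. The names `affine_grid` and `bilinear_sample` are hypothetical, the code is a plain NumPy forward pass, and it is not the authors' implementation. In a trainable setting, `theta` would typically be predicted by a small network and gradients would flow through the interpolation weights.

```python
import numpy as np


def affine_grid(theta, height, width):
    """Map every output pixel to a source location in [-1, 1] coordinates.

    theta is a 2x3 affine matrix (an assumed parameterization).
    Returns an array of shape (height, width, 2) holding (x, y) pairs.
    """
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height),
        np.linspace(-1.0, 1.0, width),
        indexing="ij",
    )
    coords = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # (H, W, 3)
    return coords @ theta.T  # (H, W, 2)


def bilinear_sample(feature_map, grid):
    """Resample a 2-D feature map (at least 2x2) at normalized grid locations.

    Locations outside the map are clamped to the nearest border value.
    """
    h, w = feature_map.shape
    # Convert normalized coordinates to fractional pixel indices.
    x = (grid[..., 0] + 1.0) * (w - 1) / 2.0
    y = (grid[..., 1] + 1.0) * (h - 1) / 2.0
    # Top-left corner of the 2x2 neighbourhood used for interpolation.
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    # Interpolation weights, clipped so out-of-range points stick to the border.
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = (1 - wx) * feature_map[y0, x0] + wx * feature_map[y0, x0 + 1]
    bottom = (1 - wx) * feature_map[y0 + 1, x0] + wx * feature_map[y0 + 1, x0 + 1]
    return (1 - wy) * top + wy * bottom


# Usage: the identity transform leaves the map unchanged,
# and negating the x axis mirrors it horizontally.
features = np.arange(12, dtype=float).reshape(3, 4)
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
flip_x = np.array([[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
same = bilinear_sample(features, affine_grid(identity, 3, 4))
mirrored = bilinear_sample(features, affine_grid(flip_x, 3, 4))
```

Because the output is a weighted sum of neighbouring input values, the operation is differentiable with respect to both the feature map and `theta`, which is the property that lets such a module sit between convolutional layers and be trained end to end.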

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published in
ACM Transactions on Multimedia Computing, Communications, and Applications Volume 17, Issue 2s
June 2021
349 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3465440
Editor:
Alberto Del Bimbo
University of Firenze, Italy
Copyright © 2021 Association for Computing Machinery.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 14 June 2021
- Revised: 1 March 2021
- Accepted: 1 March 2021
- Received: 1 August 2020
Author Tags
Stylistic image generation
image-to-image translation
generative model
domain adaptation
Qualifiers
- research-article
- Refereed
