Abstract
Image-to-image translation aims to learn a mapping that translates an image from a source domain to a target domain. Three main challenges are associated with this problem and need to be addressed: the lack of paired datasets, multimodality, and diversity. Convolutional neural networks (CNNs), despite their strong performance in many computer vision tasks, fail to detect the hierarchy of spatial relationships between different parts of an object and thus do not form the ideal representative model we look for. This article presents a new variation of generative models that aims to remedy this problem. We use a trainable transformer, which explicitly allows the spatial manipulation of data during training. This differentiable module can be inserted into the convolutional layers of the generative model, and it allows the generated distributions to be altered freely for image-to-image translation. To reap the benefits of the proposed module within the generative model, our architecture incorporates a new loss function that facilitates effective end-to-end generative learning for image-to-image translation. The proposed model is evaluated through comprehensive experiments on image synthesis and image-to-image translation, along with comparisons with several state-of-the-art algorithms.
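To make the idea of a differentiable spatial manipulation module more concrete, the following is a minimal sketch in plain Python. It assumes the module behaves like an affine spatial transformer: a 2x3 parameter matrix `theta` defines a sampling grid, and the input is resampled with bilinear interpolation. The abstract does not specify the form of the module, so the affine parameterization, the function names (`affine_grid`, `bilinear_sample`, `spatial_transform`), and the zero padding are all illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]
Grid = List[List[Tuple[float, float]]]


def affine_grid(theta: Sequence[Sequence[float]], height: int, width: int) -> Grid:
    """Map each output pixel to a source location in normalized [-1, 1] coordinates.

    `theta` is a hypothetical 2x3 affine matrix; in a trainable module it
    would be predicted by a small network and learned by backpropagation.
    """
    grid: Grid = []
    for i in range(height):
        # Normalized output coordinate; a single row or column maps to 0.
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(image: Image, grid: Grid) -> Image:
    """Sample `image` at the grid locations; out-of-range pixels read as 0."""
    h, w = len(image), len(image[0])

    def pixel(r: int, c: int) -> float:
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    out: Image = []
    for grid_row in grid:
        out_row = []
        for xs, ys in grid_row:
            # Convert normalized coordinates back to fractional pixel indices.
            col = (xs + 1.0) * (w - 1) / 2.0
            row = (ys + 1.0) * (h - 1) / 2.0
            c0, r0 = math.floor(col), math.floor(row)
            dc, dr = col - c0, row - r0
            # Weighted average of the four neighbouring pixels.
            value = (
                (1 - dr) * (1 - dc) * pixel(r0, c0)
                + (1 - dr) * dc * pixel(r0, c0 + 1)
                + dr * (1 - dc) * pixel(r0 + 1, c0)
                + dr * dc * pixel(r0 + 1, c0 + 1)
            )
            out_row.append(value)
        out.append(out_row)
    return out


def spatial_transform(image: Image, theta: Sequence[Sequence[float]]) -> Image:
    """Apply the affine warp `theta` to `image`, keeping the input size."""
    return bilinear_sample(image, affine_grid(theta, len(image), len(image[0])))


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
flip_horizontal = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

same = spatial_transform(image, identity)
flipped = spatial_transform(image, flip_horizontal)
```

With the identity matrix the output equals the input, and negating the first entry of `theta` mirrors the image horizontally. Because bilinear interpolation is a weighted average of neighbouring pixels, the output varies smoothly (piecewise) with `theta`, which is the property that would let such a module be trained end to end alongside convolutional layers.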
Index Terms
- Equivariant Adversarial Network for Image-to-image Translation