SA-SinGAN: self-attention for single-image generation adversarial networks
Machine Vision and Applications (IF 3.3), Pub Date: 2021-07-09, DOI: 10.1007/s00138-021-01228-z
Xi Chen 1, Hongdong Zhao 1, Dongxu Yang 1, Qing Kang 1, Haiyan Lu 1, Yueyuan Li 2

Single-image training is an active research topic for generative adversarial networks, especially in tasks such as image editing and image harmonization. However, existing networks suffer from long training times, poor image quality, and unstable training. To address these issues, we propose a single-image generative adversarial network with a self-attention mechanism and analyze how the model changes when self-attention is placed at different positions in the generator. We introduce spectral normalization in both the generator and discriminator networks to stabilize training and compare the influence of the learning rate on the network. We evaluate the model with visual inspection and quantitative metrics on three representative datasets and compare it with current, more advanced models. Experiments show that our proposed model outperforms the single-sample generative adversarial network, reducing the Single Image Fréchet Inception Distance (SIFID) from 4.80 to 2.057 on the challenging Generation dataset, from 0.06 to 0.02 on the Places dataset, and from 0.23 to 0.04 on the LSUN dataset. The training time of our model is one-ninth that of the single-sample generative adversarial network, and the model captures the overall structure of the single training sample, which is of great research significance.
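To make the two architectural ingredients concrete, the sketch below shows what a SAGAN-style self-attention block and spectrally normalized convolutions might look like inside one SinGAN-style generator stage. This is a minimal, hypothetical PyTorch illustration: the layer names, channel counts, and the mid-stage placement of the attention block are assumptions for exposition, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code): a self-attention block plus
# spectral normalization inside one single-image generator stage.
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm


class SelfAttention(nn.Module):
    """Non-local self-attention over spatial positions (SAGAN-style)."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.key = spectral_norm(nn.Conv2d(channels, channels // 8, 1))
        self.value = spectral_norm(nn.Conv2d(channels, channels, 1))
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C//8)
        k = self.key(x).flatten(2)                     # (B, C//8, HW)
        attn = torch.softmax(q @ k, dim=-1)            # (B, HW, HW) attention map
        v = self.value(x).flatten(2)                   # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection


class GeneratorStage(nn.Module):
    """One conv stage of a SinGAN-style generator; the attention block is
    inserted mid-stage here, one of several placements a study could compare."""

    def __init__(self, in_ch: int = 3, feat: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            spectral_norm(nn.Conv2d(in_ch, feat, 3, padding=1)),
            nn.BatchNorm2d(feat),
            nn.LeakyReLU(0.2),
            SelfAttention(feat),                       # assumed placement
            spectral_norm(nn.Conv2d(feat, feat, 3, padding=1)),
            nn.BatchNorm2d(feat),
            nn.LeakyReLU(0.2),
            nn.Conv2d(feat, in_ch, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, noise: torch.Tensor, prev: torch.Tensor) -> torch.Tensor:
        # SinGAN-style residual: add the upsampled previous-scale output back in.
        return self.body(noise + prev) + prev


if __name__ == "__main__":
    g = GeneratorStage()
    z = torch.randn(1, 3, 64, 64)
    print(g(z, torch.zeros_like(z)).shape)  # torch.Size([1, 3, 64, 64])
```

Spectral normalization constrains the Lipschitz constant of each wrapped layer, which is the standard rationale for the training-stability benefit the abstract reports; the learned gamma lets the network start from the plain convolutional path and gradually weight in the attention output.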




Updated: 2021-07-12