BSD-GAN: Branched Generative Adversarial Network for Scale-Disentangled Representation Learning and Image Synthesis.
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2020-08-12, DOI: 10.1109/tip.2020.3014608
Zili Yi, Zhiqin Chen, Hao Cai, Wendong Mao, Minglun Gong, Hao Zhang

We introduce BSD-GAN, a novel multi-branch and scale-disentangled training method that enables unconditional Generative Adversarial Networks (GANs) to learn image representations at multiple scales, benefiting a wide range of generation and editing tasks. The key feature of BSD-GAN is that it is trained in multiple branches, progressively covering both the breadth and depth of the network, as the resolution of the training images increases to reveal finer-scale features. Specifically, each noise vector, as input to the generator network of BSD-GAN, is deliberately split into several sub-vectors, each corresponding to, and trained to learn, image representations at a particular scale. During training, we progressively “de-freeze” the sub-vectors, one at a time, as a new set of higher-resolution images is employed for training and more network layers are added. A consequence of this explicit sub-vector designation is that we can directly manipulate and even combine latent (sub-vector) codes that model different feature scales. Extensive experiments demonstrate the effectiveness of our training method in scale-disentangled learning of image representations and in the synthesis of novel image content, without any extra labels and without compromising the quality of the synthesized high-resolution images. We further demonstrate several image generation and manipulation applications enabled or improved by BSD-GAN.
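
The sub-vector idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the sub-vector sizes, the function names (`split_latent`, `sample_latent`, `mix_scales`), and the choice to hold frozen sub-vectors at zero are all assumptions made for illustration, and no generator network is included. It only shows how a noise vector might be partitioned by scale, how sub-vectors could be activated one stage at a time, and how codes from two samples could be combined.

```python
import numpy as np

# Hypothetical sub-vector sizes, coarsest scale first (not taken from the paper).
SUB_SIZES = [4, 4, 8]


def split_latent(z, sizes=SUB_SIZES):
    """Split a latent vector into consecutive sub-vectors, one per scale."""
    bounds = np.cumsum(sizes)[:-1]
    return np.split(z, bounds, axis=-1)


def sample_latent(rng, num_active, sizes=SUB_SIZES):
    """Sample noise for the first `num_active` scales only.

    Assumption of this sketch: sub-vectors that are still frozen are held
    at zero until their training stage begins.
    """
    parts = [
        rng.standard_normal(n) if i < num_active else np.zeros(n)
        for i, n in enumerate(sizes)
    ]
    return np.concatenate(parts)


def mix_scales(z_a, z_b, take_from_b, sizes=SUB_SIZES):
    """Copy z_a, replacing the sub-vectors whose indices are in `take_from_b`
    with the corresponding sub-vectors of z_b."""
    parts_a = split_latent(z_a, sizes)
    parts_b = split_latent(z_b, sizes)
    mixed = [parts_b[i] if i in take_from_b else parts_a[i]
             for i in range(len(sizes))]
    return np.concatenate(mixed)


rng = np.random.default_rng(0)
z_stage1 = sample_latent(rng, num_active=1)    # only the coarsest scale is active
z_a = sample_latent(rng, num_active=3)         # all scales active
z_b = sample_latent(rng, num_active=3)
z_mix = mix_scales(z_a, z_b, take_from_b={2})  # coarse code from a, finest from b
```

In this sketch, `z_mix` keeps the two coarser sub-vectors of `z_a` and takes the finest one from `z_b`, which mirrors the kind of per-scale code combination the abstract describes.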

Updated: 2020-09-29