Unsupervised multi-modal Styled Content Generation
arXiv - CS - Graphics, Pub Date: 2020-01-10, DOI: arxiv-2001.03640
Omry Sendik, Dani Lischinski, Daniel Cohen-Or

The emergence of deep generative models has recently enabled the automatic generation of massive amounts of graphical content, both in 2D and in 3D. Generative Adversarial Networks (GANs) and style control mechanisms, such as Adaptive Instance Normalization (AdaIN), have proved particularly effective in this context, culminating in the state-of-the-art StyleGAN architecture. While such models are able to learn diverse distributions, provided a sufficiently large training set, they are not well-suited for scenarios where the distribution of the training data exhibits a multi-modal behavior. In such cases, reshaping a uniform or normal distribution over the latent space into a complex multi-modal distribution in the data domain is challenging, and the generator might fail to sample the target distribution well. Furthermore, existing unsupervised generative models are not able to control the mode of the generated samples independently of the other visual attributes, despite the fact that they are typically disentangled in the training data. In this paper, we introduce UMMGAN, a novel architecture designed to better model multi-modal distributions, in an unsupervised fashion. Building upon the StyleGAN architecture, our network learns multiple modes, in a completely unsupervised manner, and combines them using a set of learned weights. We demonstrate that this approach is capable of effectively approximating a complex distribution as a superposition of multiple simple ones. We further show that UMMGAN effectively disentangles between modes and style, thereby providing an independent degree of control over the generated content.
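To make the central idea concrete, the following is a minimal PyTorch sketch of the mode-mixing mechanism the abstract describes: a set of learned mode embeddings is combined by learned weights into a single code that conditions a StyleGAN-like generator through AdaIN. This is an illustration of the principle, not the authors' implementation; the module names, layer sizes, and the softmax-based weighting are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModeMixer(nn.Module):
    """Hypothetical sketch: K learnable mode embeddings combined by learned
    weights into one code, approximating a complex distribution as a
    superposition of simpler ones (as described in the abstract)."""

    def __init__(self, num_modes: int = 8, latent_dim: int = 512):
        super().__init__()
        # K learnable "mode" embeddings, one per simple component.
        self.modes = nn.Parameter(torch.randn(num_modes, latent_dim))
        # Small head mapping a latent sample to convex mixing weights.
        self.to_weights = nn.Linear(latent_dim, num_modes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, latent_dim) sampled from N(0, I)
        w = F.softmax(self.to_weights(z), dim=-1)   # (batch, K) mixing weights
        mode_code = w @ self.modes                   # weighted superposition of modes
        return mode_code                             # would condition AdaIN layers downstream


def adain(content: torch.Tensor, scale: torch.Tensor, bias: torch.Tensor,
          eps: float = 1e-5) -> torch.Tensor:
    """Standard Adaptive Instance Normalization: normalize each feature map,
    then re-scale and re-shift it with per-channel statistics derived from a
    style/mode code. content: (B, C, H, W); scale, bias: (B, C)."""
    mean = content.mean(dim=(2, 3), keepdim=True)
    std = content.std(dim=(2, 3), keepdim=True) + eps
    normalized = (content - mean) / std
    return scale[:, :, None, None] * normalized + bias[:, :, None, None]


if __name__ == "__main__":
    mixer = ModeMixer(num_modes=8, latent_dim=512)
    z = torch.randn(4, 512)
    code = mixer(z)
    print(code.shape)  # torch.Size([4, 512])
```

In this sketch, disentanglement of mode and style would come from keeping the mode weights and the per-layer style code as separate inputs to the generator, so either can be varied while the other is held fixed.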

Updated: 2020-04-28