Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter
arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2020-07-14 , DOI: arxiv-2007.07243
Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A. Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro

Conventional CNNs for texture synthesis consist of a sequence of (de-)convolution and up/down-sampling layers, where each layer operates locally and cannot capture the long-range structural dependencies that texture synthesis requires. As a result, they often simply enlarge the input texture rather than perform reasonable synthesis. As a compromise, many recent methods sacrifice generalizability by training and testing on the same single texture image (or a fixed set of them), incurring large re-training costs for unseen images. In this work, based on the observation that the assembling/stitching operation in traditional texture synthesis is analogous to a transposed convolution, we propose a novel way of using the transposed convolution operation. Specifically, we directly treat the whole encoded feature map of the input texture as the transposed convolution filter, and the features' self-similarity map, which captures auto-correlation information, as the transposed convolution input. This design allows our framework, once trained, to generalize to unseen textures and synthesize them in a single forward pass in near real time. Our method achieves state-of-the-art texture synthesis quality under various metrics. While self-similarity helps preserve the regular structural patterns of the input texture, our framework can instead take random noise maps as transposed convolution inputs for irregular input textures. This yields more diverse results and, by directly sampling large noise maps, generates arbitrarily large texture outputs in a single pass.
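The core mechanism described above can be sketched in a few lines of PyTorch. This is a hedged illustration, not the authors' implementation: the feature shapes, the stride, and the toy center-based similarity measure are all assumptions made for the example; the paper's actual self-similarity map and architecture are richer.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
C, H, W = 8, 16, 16                      # assumed encoded feature map: 8 channels, 16x16
feat = torch.randn(1, C, H, W)           # stand-in for an encoder's output

# Toy self-similarity map: dot product of every location's feature vector
# with the center location's, normalized with a softmax. The paper's
# auto-correlation map is computed differently; this is only illustrative.
center = feat[:, :, H // 2, W // 2].reshape(1, C, 1, 1)
sim = (feat * center).sum(dim=1, keepdim=True)            # (1, 1, H, W)
sim = torch.softmax(sim.flatten(), dim=0).reshape(1, 1, H, W)

# Transposed convolution with the whole feature map as the filter bank:
# each response in `sim` "stamps" a weighted copy of the texture features,
# mirroring the assembling/stitching analogy in the abstract.
weight = feat                            # (in=1, out=C, kH=H, kW=W) for conv_transpose2d
out = F.conv_transpose2d(sim, weight, stride=8)
print(out.shape)                         # (1, 8, 136, 136): (H-1)*stride + H = 136

# Replacing `sim` with a larger random noise map directly yields a larger
# synthesized feature map in the same single pass.
noise = torch.rand(1, 1, 32, 32)
out_large = F.conv_transpose2d(noise, weight, stride=8)
print(out_large.shape)                   # (1, 8, 264, 264)
```

Because the transposed convolution's output size grows with its input size, sampling a bigger noise map is all that is needed for an arbitrarily large output, which is the single-pass scalability the abstract claims.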

Updated: 2020-07-15