Efficient Semantic Image Synthesis via Class-Adaptive Normalization.
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2021-04-29, DOI: 10.1109/tpami.2021.3076487
Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis. It modulates the normalized activations with spatially-varying transformations learned from semantic layouts, which prevents the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of where its advantages come from is still needed to help reduce the significant computation and parameter overhead that this structure introduces. In this paper, from a return-on-investment point of view, we conduct an in-depth analysis of the effectiveness of spatially-adaptive normalization and observe that its modulation parameters benefit more from semantic-awareness than from spatial-adaptiveness, especially for high-resolution input masks. Inspired by this observation, we propose class-adaptive normalization (CLADE), a lightweight but equally effective variant that is adaptive only to semantic class. To further improve spatial-adaptiveness, we introduce an intra-class positional map encoding, calculated from semantic layouts, to modulate the normalization parameters of CLADE, and propose a truly spatially-adaptive variant of CLADE, namely CLADE-ICPE. Through extensive experiments on multiple challenging datasets, we demonstrate that the proposed CLADE generalizes to different SPADE-based methods and achieves generation quality comparable to SPADE, while being much more efficient, with fewer extra parameters and lower computational cost.
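
The idea of modulation that depends only on semantic class can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each channel is normalized over its spatial positions (instance-norm style) and that the scale and shift are read from per-class lookup tables indexed by the label map. The names `normalize`, `clade`, `gamma`, and `beta` are hypothetical, and the table values are made up for illustration. In a trained model they would be learned.

```python
def normalize(channel, eps=1e-5):
    """Parameter-free normalization of one H x W channel (assumed instance-norm style)."""
    flat = [v for row in channel for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    scale = (var + eps) ** 0.5
    return [[(v - mean) / scale for v in row] for row in channel]


def clade(features, label_map, gamma, beta):
    """Class-adaptive modulation sketch.

    features:    C x H x W nested lists of activations
    label_map:   H x W nested lists of integer class ids
    gamma, beta: num_classes x C lookup tables (illustrative values here)
    """
    out = []
    for c, channel in enumerate(features):
        normed = normalize(channel)
        # Each pixel looks up the scale and shift of its own semantic class,
        # so the parameters vary across classes but not within a class.
        out.append([
            [gamma[k][c] * x + beta[k][c] for x, k in zip(row, label_row)]
            for row, label_row in zip(normed, label_map)
        ])
    return out


# Toy example: one channel, a 2 x 2 feature map, two semantic classes.
features = [[[1.0, 2.0], [3.0, 4.0]]]
label_map = [[0, 0], [1, 1]]
gamma = [[1.0], [2.0]]   # class 0 leaves the activation as is, class 1 doubles it
beta = [[0.0], [0.5]]    # class 1 also adds a shift
out = clade(features, label_map, gamma, beta)
```

In this sketch the only layout-dependent parameters are the two small tables, one row per class, rather than a convolutional network applied to the mask. That is the intuition behind the lower parameter and computation cost described above.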

Updated: 2021-04-29