Image Inpainting by End-to-End Cascaded Refinement With Mask Awareness
IEEE Transactions on Image Processing (IF 10.6) Pub Date: 2021-05-04, DOI: 10.1109/tip.2021.3076310
Manyu Zhu, Dongliang He, Xin Li, Chao Li, Fu Li, Xiao Liu, Errui Ding, Zhaoxiang Zhang

Inpainting arbitrary missing regions is challenging because learning valid features for various masked regions is nontrivial. Although U-shaped encoder-decoder frameworks have proven successful, most of them share a common drawback: they are unaware of the mask during feature extraction, because all convolution windows (or regions), including those containing missing pixels of various shapes, are treated equally and filtered with fixed learned kernels. To address this, we propose a novel mask-aware inpainting solution. First, a Mask-Aware Dynamic Filtering (MADF) module is designed to effectively learn multi-scale features for missing regions in the encoding phase. Specifically, the filters for each convolution window are generated from the features of the corresponding region of the mask. The second form of mask awareness is achieved by adopting Point-wise Normalization (PN) in the decoding phase, since the statistics of features at masked points differ from those at unmasked points. The proposed PN tackles this issue by dynamically assigning a point-wise scaling factor and bias. Lastly, our model is designed as an end-to-end cascaded refinement network. Supervision signals such as reconstruction loss, perceptual loss and total variation loss are incrementally leveraged to improve the inpainting results from coarse to fine. The effectiveness of the proposed framework is validated both quantitatively and qualitatively via extensive experiments on three public datasets: Places2, CelebA and Paris StreetView.
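
To make the MADF idea concrete, here is a minimal PyTorch sketch of mask-aware dynamic filtering: rather than a single fixed learned kernel, the kernel applied at each spatial position is generated from the corresponding region of the mask. The module name, channel sizes and the one-layer kernel generator are assumptions made for illustration, not the authors' exact architecture.

```python
# Minimal sketch of Mask-Aware Dynamic Filtering (MADF), assuming a PyTorch
# setting. The kernel used at every spatial position is predicted from the
# local mask region instead of being a single fixed learned kernel.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskAwareDynamicFilter(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        # Predicts a full (out_ch x in_ch x k x k) kernel at every position
        # from the surrounding mask region (illustrative one-layer generator).
        self.kernel_gen = nn.Conv2d(1, out_ch * in_ch * k * k, k, padding=k // 2)

    def forward(self, feat, mask):
        # feat: (B, in_ch, H, W); mask: (B, 1, H, W), 1 = valid pixel, 0 = hole
        b, _, h, w = feat.shape
        k = self.k
        kernels = self.kernel_gen(mask)                        # (B, out*in*k*k, H, W)
        kernels = kernels.view(b, self.out_ch, self.in_ch * k * k, h * w)
        patches = F.unfold(feat, k, padding=k // 2)            # (B, in*k*k, H*W)
        out = torch.einsum('boip,bip->bop', kernels, patches)  # per-position filtering
        return out.view(b, self.out_ch, h, w)
```

In this sketch, windows with different hole shapes are filtered by different kernels because the kernel generator sees the mask itself, which is the essence of the mask awareness described above.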

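Similarly, a hedged sketch of Point-wise Normalization: the decoder feature is first normalized and then re-modulated with a per-pixel scale and bias predicted from the (resized) mask, so masked and unmasked points receive different statistics. The use of instance normalization and 3x3 prediction convolutions here are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of Point-wise Normalization (PN), assuming PyTorch.
# A per-pixel scaling factor and bias are predicted from the mask and applied
# after normalization, so masked and unmasked points are modulated differently.
import torch.nn as nn
import torch.nn.functional as F

class PointwiseNorm(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.norm = nn.InstanceNorm2d(ch, affine=False)
        self.to_scale = nn.Conv2d(1, ch, 3, padding=1)  # point-wise scaling factor
        self.to_bias = nn.Conv2d(1, ch, 3, padding=1)   # point-wise bias

    def forward(self, feat, mask):
        # feat: (B, ch, H, W); mask: (B, 1, *, *), resized to the feature size
        mask = F.interpolate(mask, size=feat.shape[2:], mode='nearest')
        gamma = self.to_scale(mask)
        beta = self.to_bias(mask)
        return self.norm(feat) * (1.0 + gamma) + beta
```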
Updated: 2021-05-11