Pixel Transposed Convolutional Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2019-01-18, DOI: 10.1109/tpami.2019.2893965
Hongyang Gao , Hao Yuan , Zhengyang Wang , Shuiwang Ji

Transposed convolutional layers have been widely used for up-sampling in a variety of deep models, including encoder-decoder networks for semantic segmentation and deep generative models for unsupervised learning. One of the key limitations of transposed convolutional operations is that they result in the so-called checkerboard problem, which arises because no direct relationship exists among adjacent pixels on the output feature map. To address this problem, we propose the pixel transposed convolutional layer (PixelTCL), which establishes direct relationships among adjacent pixels on the up-sampled feature map. Our method is based on a fresh interpretation of the regular transposed convolutional operation. The resulting PixelTCL can replace any transposed convolutional layer in a plug-and-play manner without compromising the fully trainable capabilities of the original models. PixelTCL may result in a slight decrease in efficiency, but this can be overcome by an implementation trick. Experimental results on semantic segmentation demonstrate that PixelTCL can take spatial features such as edges and shapes into account and yields more accurate segmentation outputs than transposed convolutional layers. When used in image generation tasks, PixelTCL can largely overcome the checkerboard problem suffered by regular transposed convolutional operations.
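
The abstract does not spell out the interpretation it relies on. As an illustrative assumption, one well-known way to read a stride-2 transposed convolution is as several independent ordinary convolutions whose outputs are interleaved, so that neighbouring output positions come from different kernels. The 1D pure-Python sketch below checks that equivalence; the function names are hypothetical, an even-length kernel is assumed, and this is not the authors' PixelTCL implementation.

```python
def transposed_conv1d(x, w, stride=2):
    """Direct 1D transposed convolution: scatter x[i] * w into the output."""
    y = [0.0] * (stride * (len(x) - 1) + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            y[stride * i + k] += xi * wk
    return y


def full_conv1d(x, k):
    """Ordinary full (zero-padded) 1D convolution of x with kernel k."""
    out = [0.0] * (len(x) + len(k) - 1)
    for i, xi in enumerate(x):
        for j, kj in enumerate(k):
            out[i + j] += xi * kj
    return out


def interleaved_conv1d(x, w):
    """Same result as the stride-2 case, built from two independent convolutions.

    Assumes len(w) is even so both sub-kernels have the same length.
    """
    even = full_conv1d(x, w[0::2])  # produces even output positions
    odd = full_conv1d(x, w[1::2])   # produces odd output positions
    y = [0.0] * (len(even) + len(odd))
    y[0::2] = even
    y[1::2] = odd
    return y


# Constant input with a kernel whose even and odd taps differ.
x = [1.0] * 6
w = [1.0, 0.25, 1.0, 0.25]

direct = transposed_conv1d(x, w)
shuffled = interleaved_conv1d(x, w)
```

On this constant input the interior of the output alternates between two different values, because even and odd positions are produced by unrelated sub-kernels. That alternation is a 1D analogue of the checkerboard pattern the abstract describes, and it is the kind of independence among adjacent pixels that PixelTCL is said to remove.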

Updated: 2024-08-22