Two-branch encoding and iterative attention decoding network for semantic segmentation, Neural Computing and Applications


Two-branch encoding and iterative attention decoding network for semantic segmentation
Neural Computing and Applications (IF 4.5), Pub Date: 2020-09-01, DOI: 10.1007/s00521-020-05312-9
Hegui Zhu, Min Zhang, Xiangde Zhang, Libo Zhang

Deep convolutional neural networks (DCNNs) have shown outstanding performance in semantic image segmentation. In this paper, we propose a semantic segmentation model that combines two-branch encoding with iterative attention decoding. In the encoding stage, an improved PeleeNet serves as the backbone branch to extract dense image features, while a spatial branch preserves fine-grained information. In the decoding stage, iterative attention decoding refines the segmentation results using multi-scale features. We further propose a channel position attention module and a boundary residual attention module, which learn position and boundary features respectively and thereby enrich the position information of target boundaries. We use SegNet as the baseline network and run experiments on the CamVid dataset to evaluate the contribution of each component of the proposed model, measured by accuracy and mIoU. We then verify the segmentation performance of the proposed model through comparative experiments on the CamVid, Cityscapes and PASCAL VOC 2012 datasets. In particular, the model achieves 91.7% segmentation accuracy and 67.1% mIoU on the CamVid dataset, which verifies its effectiveness. In future work, object detection could be combined with semantic segmentation to further improve the segmentation of small objects. We also hope to further optimize the model structure and reduce its time complexity and parameter count while maintaining its effectiveness.
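The abstract reports results as segmentation accuracy and mIoU. As a rough illustration of how these two metrics are commonly computed, the following is a minimal sketch based on the standard confusion-matrix definitions. It is not the authors' evaluation code: the abstract does not state their protocol (for example, how unlabeled pixels or absent classes are handled), and the function names `confusion_matrix`, `pixel_accuracy` and `mean_iou` are hypothetical. The sketch assumes labels are given as flat sequences of integer class ids, and it skips classes that appear in neither the labels nor the predictions.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all pixels whose predicted class equals the ground truth."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    return correct / total if total else 0.0


def mean_iou(matrix: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes and 6 pixels (made-up values, not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)                  # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(pixel_accuracy(cm))  # 4 of 6 pixels are correct
print(mean_iou(cm))        # mean of the per-class IoUs 1/3, 2/3 and 1/2
```

The toy example shows why the two numbers can differ noticeably: accuracy weights every pixel equally, whereas mIoU averages over classes, so errors on rarely occurring classes lower mIoU more than they lower accuracy.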


Updated: 2020-09-02
