Multilevel feature fusion dilated convolutional network for semantic segmentation
International Journal of Advanced Robotic Systems ( IF 2.1 ) Pub Date : 2021-04-15 , DOI: 10.1177/17298814211007665
Tao Ku 1, 2 , Qirui Yang 1, 2, 3 , Hao Zhang 1, 2

Recently, convolutional neural networks (CNNs) have led to significant improvements in the field of computer vision, especially in the accuracy and speed of semantic segmentation tasks, which has greatly improved robot scene perception. In this article, we propose a multilevel feature fusion dilated convolution network (Refine-DeepLab). By improving the spatial pyramid pooling structure, we propose a multiscale hybrid dilated convolution module, which captures rich context information and effectively alleviates the contradiction between receptive field size and the dilated convolution operation. At the same time, the high-level and low-level semantic information obtained through multilevel, multiscale feature extraction effectively improves the capture of global information and the performance of large-scale target segmentation. The encoder–decoder gradually recovers spatial information while capturing high-level semantic information, resulting in sharper object boundaries. Extensive experiments verify the effectiveness of the proposed Refine-DeepLab model: we evaluate our approach thoroughly on the PASCAL VOC 2012 data set without MS COCO pretraining and achieve a state-of-the-art result of 81.73% mean intersection-over-union (mIoU) on the validation set.
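The abstract describes two architectural ideas: an ASPP-style module with parallel dilated convolutions at mixed rates, and an encoder–decoder that fuses upsampled high-level features with low-level features to sharpen boundaries. The sketch below illustrates these ideas in PyTorch; the module names, dilation rates (1, 2, 5), and channel counts are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridDilatedBlock(nn.Module):
    """Parallel 3x3 convolutions with different dilation rates plus a
    global-pooling branch, projected back to a single feature map."""

    def __init__(self, in_ch, out_ch, rates=(1, 2, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Image-level pooling branch captures global context.
        self.global_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Sequential(
            nn.Conv2d(out_ch * (len(rates) + 1), out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        g = F.interpolate(self.global_pool(x), size=(h, w),
                          mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [g], dim=1))


class FeatureFusionDecoder(nn.Module):
    """Fuse upsampled high-level features with reduced low-level features,
    then classify per pixel; intended to recover sharper object boundaries."""

    def __init__(self, high_ch, low_ch, out_ch, num_classes):
        super().__init__()
        self.reduce_low = nn.Conv2d(low_ch, 48, 1, bias=False)
        self.fuse = nn.Sequential(
            nn.Conv2d(high_ch + 48, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Conv2d(out_ch, num_classes, 1)

    def forward(self, high, low):
        # Upsample high-level features to the low-level spatial resolution.
        high = F.interpolate(high, size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        x = torch.cat([high, self.reduce_low(low)], dim=1)
        return self.classifier(self.fuse(x))
```

Mixing small and large dilation rates in one block is a common way to widen the receptive field without the gridding artifacts of a single large rate; the low-level fusion step mirrors the DeepLabv3+-style decoder that the abstract's encoder–decoder description suggests.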




Updated: 2021-04-15