Cross-Attentional Bracket-shaped Convolutional Network for semantic image segmentation
Information Sciences Pub Date: 2020-06-20, DOI: 10.1016/j.ins.2020.06.023
Cam-Hao Hua, Thien Huynh-The, Sung-Ho Bae, Sungyoung Lee

As perception-related applications are of great importance in industrial production and daily life, methods for understanding given images semantically have received considerable attention in the literature. Significant progress has been made on this pixel-wise segmentation problem thanks to novel ways of integrating global context into local details within convolutional neural networks. However, existing work following this strategy does not exhaustively exploit middle-level features, which strike a reasonable balance between fine-grained and semantic information. Therefore, this paper introduces a Cross-Attentional Bracket-shaped Convolutional Network (CAB-Net) to leverage their contribution to constructing the pixel-wise labeled map. Concretely, fine-to-coarse feature maps of interest from the backbone network are densely combined by an efficient fusion of channel-wise and spatial attention schemes applied in a crossing manner, namely Cross-Attentional Fusion, which embeds semantically rich features into finer patterns. The newly decoded outputs then repeat the same procedure round by round until a final feature map at the finest resolution is produced for complete scene understanding. Consequently, the proposed CAB-Net achieves competitive mean Intersection over Union performance on the PASCAL VOC 2012 (83.6% without MS-COCO pretraining), CamVid (76.4%) and Cityscapes (78.3%) datasets.
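To make the "crossing" exchange of channel-wise and spatial attention between a fine and a coarse feature map more concrete, below is a minimal PyTorch sketch of a fusion block in the spirit of the abstract. The exact layer arrangement (kernel sizes, normalization placement, how the two attended branches are combined) is an assumption for illustration; the paper itself defines the precise Cross-Attentional Fusion module.

```python
# Hedged sketch: channel attention is computed from the coarse (semantically rich)
# map and re-weights the fine map, while spatial attention is computed from the
# fine map and re-weights the upsampled coarse map -- a crossing attention scheme.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossAttentionalFusion(nn.Module):
    """Fuse a fine (high-resolution) and a coarse (low-resolution) feature map."""

    def __init__(self, fine_channels: int, coarse_channels: int, out_channels: int):
        super().__init__()
        # Channel attention from the coarse branch (squeeze-and-excitation style).
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(coarse_channels, fine_channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention from the fine branch (single-channel mask).
        self.spatial_att = nn.Sequential(
            nn.Conv2d(fine_channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Project the coarse map to the fine map's channel width before fusion.
        self.coarse_proj = nn.Conv2d(coarse_channels, fine_channels, kernel_size=1)
        self.fuse = nn.Sequential(
            nn.Conv2d(fine_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # Re-weight fine-level channels with context pooled from the coarse level.
        fine_attended = fine * self.channel_att(coarse)
        # Upsample the coarse map to the fine resolution after channel projection.
        coarse_up = F.interpolate(
            self.coarse_proj(coarse), size=fine.shape[2:],
            mode="bilinear", align_corners=False,
        )
        # Re-weight coarse-level positions with a mask computed from the fine level.
        coarse_attended = coarse_up * self.spatial_att(fine)
        # Element-wise sum followed by a fusion convolution.
        return self.fuse(fine_attended + coarse_attended)


if __name__ == "__main__":
    # Toy shapes: a stride-8 fine map and a stride-16 coarse map from a backbone.
    fine = torch.randn(1, 256, 64, 64)
    coarse = torch.randn(1, 512, 32, 32)
    fused = CrossAttentionalFusion(256, 512, 256)(fine, coarse)
    print(fused.shape)  # torch.Size([1, 256, 64, 64])
```

In a bracket-shaped decoder, blocks like this would be applied repeatedly, round by round, each time fusing the newly decoded output with the next finer backbone feature until the finest-resolution map is reached.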




Updated: 2020-06-20