Mixed spatial pyramid pooling for semantic segmentation
Applied Soft Computing (IF 8.7), Pub Date: 2020-03-06, DOI: 10.1016/j.asoc.2020.106209
Zhengyu Xia, Joohee Kim

Semantic segmentation is a challenging task because every pixel in the image must be labeled accurately. To improve performance, some Fully Convolutional Network (FCN) based methods adopt a spatial pyramid pooling structure to enrich contextual information, while others employ an encoder–decoder architecture to recover object details gradually. In this paper, we propose a semantic segmentation framework that combines the benefits of both approaches. Specifically, we propose a Mixed Spatial Pyramid Pooling (MSPP) module based on region-based average pooling and dilated convolution to obtain dense multi-level contextual priors. To refine object details more effectively, we also propose a Global-Attention Fusion (GAF) module that provides global context as guidance for low-level features. Our proposed method achieves an mIoU of 84.1% on the PASCAL VOC 2012 dataset and 80.4% on the Cityscapes dataset without using any post-processing or additional datasets for the pretrained model.
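
The two operations the abstract names as the basis of MSPP, region-based average pooling and dilated convolution, can be illustrated in isolation. The following is a minimal pure-Python sketch, not the authors' module: the function names `region_average_pool` and `dilated_conv3x3`, the single-channel input, the bin-splitting rule, and the zero padding are all assumptions made for illustration, and the abstract does not say how MSPP combines the two operations.

```python
from typing import List

Grid = List[List[float]]  # single-channel 2D feature map (assumption)


def region_average_pool(feature: Grid, bins: int) -> Grid:
    """Average the map over a bins x bins grid of regions.

    Region boundaries use a floor/ceil split, a common adaptive-pooling
    convention; this rule is an assumption, not taken from the paper.
    """
    h, w = len(feature), len(feature[0])
    pooled = []
    for i in range(bins):
        r0 = (i * h) // bins
        r1 = ((i + 1) * h + bins - 1) // bins  # ceiling division
        row = []
        for j in range(bins):
            c0 = (j * w) // bins
            c1 = ((j + 1) * w + bins - 1) // bins
            cells = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def dilated_conv3x3(feature: Grid, kernel: Grid, dilation: int) -> Grid:
    """Apply a 3x3 kernel whose taps are spaced `dilation` pixels apart.

    Out-of-range taps are treated as zero (zero padding), so the output
    keeps the input size. A larger dilation widens the receptive field
    without adding kernel weights.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for kr in range(3):
                for kc in range(3):
                    rr = r + (kr - 1) * dilation
                    cc = c + (kc - 1) * dilation
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[kr][kc] * feature[rr][cc]
            out[r][c] = acc
    return out


if __name__ == "__main__":
    fmap = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    ones = [[1.0] * 3 for _ in range(3)]
    # Coarse (1 bin) and finer (2x2 bins) pooled summaries of the same map.
    print(region_average_pool(fmap, 1))
    print(region_average_pool(fmap, 2))
    # The same kernel at two dilation rates covers different neighbourhoods.
    print(dilated_conv3x3(fmap, ones, 1)[1][1])
    print(dilated_conv3x3(fmap, ones, 2)[1][1])
```

Pooling with several bin counts yields context at several region sizes, while varying the dilation rate changes how far apart the sampled pixels are; a pyramid-style module would typically run several such branches in parallel and merge their outputs.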




Updated: 2020-03-06