Mixed spatial pyramid pooling for semantic segmentation, Applied Soft Computing



Mixed spatial pyramid pooling for semantic segmentation
Applied Soft Computing (IF 8.7), Pub Date: 2020-03-06, DOI: 10.1016/j.asoc.2020.106209
Zhengyu Xia, Joohee Kim

Semantic segmentation is a challenging task because every pixel in the image must be labeled accurately. To improve performance, some Fully Convolutional Network (FCN) based methods adopt a spatial pyramid pooling structure to enrich contextual information, while others employ an encoder–decoder architecture to recover object details gradually. In this paper, we propose a semantic segmentation framework that combines the benefits of both approaches. Specifically, we propose a Mixed Spatial Pyramid Pooling (MSPP) module, based on region-based average pooling and dilated convolution, to obtain dense multi-level contextual priors. To refine object details more effectively, we also propose a Global-Attention Fusion (GAF) module that provides global context as guidance for low-level features. Our proposed method achieves an mIoU of 84.1% on the PASCAL VOC 2012 dataset and 80.4% on the Cityscapes dataset without using any post-processing or additional datasets for the pretrained model.
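For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) can be computed from a predicted label map and a ground-truth label map. It is not taken from the paper: the function name `mean_iou`, the flattened-list input, and the choice to skip classes absent from both maps are assumptions made for illustration. Benchmark scores such as those quoted above are normally accumulated over an entire test set rather than a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flattened per-pixel class indices (hypothetical
    input layout chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both maps
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Tiny 4-pixel example with two classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {score:.4f}")
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.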


Updated: 2020-03-06
