MGSeg: Multiple Granularity-Based Real-Time Semantic Segmentation Network, IEEE Transactions on Image Processing



IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2021-08-10, DOI: 10.1109/tip.2021.3102509
Jun-Yan He, Shi-Hua Liang, Xiao Wu, Bo Zhao, Lei Zhang

Recent work on semantic segmentation has achieved significant performance improvements by exploiting global contextual information. In this paper, an efficient multi-granularity-based network (MGSeg) is proposed for real-time semantic segmentation. It models the latent relevance between multi-scale geometric details and high-level semantics to achieve fine-granularity segmentation. Specifically, a lightweight ResNet-18 backbone is first adopted to produce hierarchical features. Hybrid Attention Feature Aggregation (HAFA) is designed to filter noisy spatial details from the features, acquire a scale-invariant representation, and alleviate the vanishing-gradient problem in early-stage feature learning. After the learned features are aggregated, a Fine Granularity Refinement (FGR) module is employed to explicitly model the relationship between the multi-level features and the categories, generating appropriate weights for fusion. More importantly, to meet real-time processing requirements, a series of lightweight strategies and simplified structures are applied to improve efficiency, including the lightweight backbone, channel compression, and a narrow neck structure. Extensive experiments on the benchmark datasets Cityscapes and CamVid demonstrate that the proposed method achieves state-of-the-art performance, 77.8% at 50 fps on Cityscapes and 72.7% at 127 fps on CamVid, making it suitable for real-time applications.
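To put the reported throughput in perspective, the frame rates can be converted into an average per-frame time budget. The sketch below is only an illustrative calculation based on the two fps figures quoted in the abstract; the helper name `frame_budget_ms` is hypothetical and is not part of the paper or any released code.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures quoted in the abstract (frames per second).
reported_fps = {"Cityscapes": 50.0, "CamVid": 127.0}

for dataset, fps in reported_fps.items():
    # 1000 ms divided by the frame rate gives the mean per-frame budget.
    print(f"{dataset}: {fps:.0f} fps is about {frame_budget_ms(fps):.2f} ms per frame")
```

Under this conversion, 50 fps corresponds to roughly 20 ms per frame and 127 fps to just under 8 ms per frame. These are averages implied by the quoted rates, not measured latencies.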


Updated: 2021-08-10
