Decoupling semantic and localization for semantic segmentation via magnitude-aware and phase-sensitive learning, Information Fusion


Decoupling semantic and localization for semantic segmentation via magnitude-aware and phase-sensitive learning
Information Fusion (IF 18.6), Pub Date: 2024-02-21, DOI: 10.1016/j.inffus.2024.102314
Qingqing Yan, Shu Li, Zongtao He, Xun Zhou, Mengxian Hu, Chengju Liu, Qijun Chen

Semantic segmentation requires results that are both semantically strong and precisely localized. However, the inherent tension between these two goals drives most existing methods to either trade off or overcompensate between high-level semantics and fine localization during resolution reconstruction, which may lead to limited performance or enormous computational cost. To this end, inspired by the frequency model of natural images, we propose a new encoder–decoder segmentation architecture, MPLSeg, built on a novel perspective: semantic-localization decoupled representation via magnitude-aware and phase-sensitive learning. Specifically, we first investigate and reveal the symmetric, inverse inherent properties of image magnitude and phase with respect to semantics and localization. Building on that, we construct a concise adaptive frequency-aware module (AFM) to alleviate the semantic gap and spatial misalignment during multi-level feature fusion. The core of AFM comprises a magnitude perceptron (MP), equipped with a dynamic magnitude weighting mechanism, and a phase amender (PA), designed with a spectral residual mapping; these keep the module sensitive to salient frequency combinations and off-norm localization features, respectively. Finally, we tailor a phase-sensitive loss (PSL) as auxiliary supervision for semantic-independent proto-localization learning. The PSL ensures multi-level feature diversity and enhances fine-grained resolution reconstruction. Extensive experimental results demonstrate the effectiveness and superiority of MPLSeg and its components. Without any fancy tricks, MPLSeg achieves state-of-the-art performance on three challenging semantic segmentation benchmarks. The code is available at .
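
The magnitude/phase view of localization can be illustrated with a small NumPy sketch. This is not the authors' MPLSeg code, and it does not implement AFM, MP, PA, or PSL; the helper names `split_magnitude_phase` and `combine_magnitude_phase` are hypothetical, chosen only for this illustration. It relies on the Fourier shift theorem: circularly shifting a 2-D signal leaves its magnitude spectrum unchanged and alters only its phase, so position information is carried by the phase. The toy input (a random patch on a zero background) is likewise an assumption made for the example.

```python
import numpy as np


def split_magnitude_phase(image):
    """Return the magnitude and phase spectra of a 2-D array (hypothetical helper)."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def combine_magnitude_phase(magnitude, phase):
    """Rebuild a real 2-D array from magnitude and phase spectra (hypothetical helper)."""
    spectrum = magnitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(spectrum))


# Toy "image": a random 8x8 patch placed on an empty 32x32 canvas.
rng = np.random.default_rng(0)
image = np.zeros((32, 32))
image[4:12, 4:12] = rng.random((8, 8))

# The same content moved to a different location (circular shift).
shifted = np.roll(image, shift=(10, 6), axis=(0, 1))

mag_a, phase_a = split_magnitude_phase(image)
mag_b, phase_b = split_magnitude_phase(shifted)

# Sanity check: magnitude and phase together recover the original image.
roundtrip_ok = np.allclose(combine_magnitude_phase(mag_a, phase_a), image)

# The shift does not change the magnitude spectrum ...
same_magnitude = np.allclose(mag_a, mag_b)

# ... so pairing the original magnitude with the shifted phase
# reproduces the shifted image: the location lives in the phase.
hybrid = combine_magnitude_phase(mag_a, phase_b)
hybrid_matches_shifted = np.allclose(hybrid, shifted)

print(roundtrip_ok, same_magnitude, hybrid_matches_shifted)
```

The sketch covers only the localization half of the abstract's claim. It says nothing about how magnitude relates to semantics in learned features, which is the part the paper investigates.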


Updated: 2024-02-21
