EDPNet: An Encoding–Decoding Network with Pyramidal Representation for Semantic Image Segmentation

Sensors (IF 3.9), Pub Date: 2023-03-17, DOI: 10.3390/s23063205
Dong Chen₁, Xianghong Li₁, Fan Hu₁, P Takis Mathiopoulos₂, Shaoning Di₃, Mingming Sui₁, Jiju Peethambaran₄

This paper proposes EDPNet, an encoding–decoding network with a pyramidal representation module, designed for efficient semantic image segmentation. During encoding, an enhanced Xception network, Xception+, serves as the backbone to learn discriminative feature maps. These features are then fed into the pyramidal representation module, which learns and optimizes context-augmented features through multi-level feature representation and aggregation. During the image-restoration decoding process, the encoded semantic-rich features are progressively recovered with the help of a simplified skip connection mechanism, which concatenates, along the channel dimension, high-level encoded features carrying rich semantic information with low-level features carrying spatial detail. The resulting hybrid representation, which combines the encoding–decoding and pyramidal structures, has global-aware perception and captures fine-grained contours of various geographical objects well, with high computational efficiency. EDPNet was compared against PSPNet, DeepLabv3, and U-Net on four benchmark datasets: eTRIMS, Cityscapes, PASCAL VOC2012, and CamVid. EDPNet achieved the highest accuracy on the eTRIMS and PASCAL VOC2012 datasets, with mIoUs of 83.6% and 73.8%, respectively, while its accuracy on the other two datasets was comparable to that of PSPNet, DeepLabv3, and U-Net. EDPNet also achieved the highest efficiency among the compared models on all datasets.
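
The accuracy figures above are reported as mIoU (mean intersection over union). As a minimal sketch of how that metric is commonly computed, the pure-Python function below builds a confusion matrix from flattened label maps and averages the per-class IoU. The function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the abstract does not describe the paper's exact evaluation protocol (for example, how ignored labels are handled).

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes that occur in pred or target (illustrative helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # confusion[t][p] counts pixels whose ground-truth class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        intersection = confusion[c][c]
        ground_truth = sum(confusion[c])                  # pixels labeled c
        predicted = sum(row[c] for row in confusion)      # pixels predicted as c
        union = ground_truth + predicted - intersection
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps, flattened row by row (hypothetical data).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. A reported value such as 83.6% corresponds to this quantity accumulated over a full test set rather than a single image.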

Updated: 2023-03-20
