A lightweight network with attention decoder for real-time semantic segmentation
The Visual Computer (IF 3.5), Pub Date: 2021-05-07, DOI: 10.1007/s00371-021-02115-4
Kang Wang, Jinfu Yang, Shuai Yuan, Mingai Li

As an important task in scene understanding, semantic segmentation requires a large amount of computation to achieve high performance. In recent years, with the rise of autonomous systems, balancing accuracy against speed has become crucial. In this paper, we propose a novel asymmetric encoder–decoder network structure to address this problem. In the encoder, we design a Separable Asymmetric Module, which combines depth-wise separable asymmetric convolution with dilated convolution to greatly reduce computation cost while maintaining accuracy. In the decoder, an attention mechanism is used to further improve segmentation performance. Experimental results on the CityScapes and CamVid datasets show that the proposed method achieves a better balance between segmentation precision and speed than state-of-the-art semantic segmentation methods. Specifically, our model obtains mean IoU of 72.5% and 66.3% on the CityScapes and CamVid test sets, respectively, with fewer than 1M parameters.
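
To give a rough sense of why depth-wise separable asymmetric convolution reduces cost, the sketch below counts the weights in a standard k x k convolution and in one possible factorization: a k x 1 depth-wise convolution, a 1 x k depth-wise convolution, and a 1 x 1 point-wise convolution. This layout, the function names, and the channel count of 128 are assumptions made for illustration; the abstract does not specify the internal structure of the Separable Asymmetric Module. Biases and normalization layers are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a dense k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def separable_asymmetric_params(c_in, c_out, k):
    """Weights in an assumed factorization: k x 1 depth-wise,
    1 x k depth-wise, then 1 x 1 point-wise (bias ignored)."""
    depthwise_vertical = c_in * k      # one k x 1 filter per input channel
    depthwise_horizontal = c_in * k    # one 1 x k filter per input channel
    pointwise = c_in * c_out           # 1 x 1 convolution mixes channels
    return depthwise_vertical + depthwise_horizontal + pointwise


def effective_kernel_extent(k, dilation):
    """Span covered by a size-k kernel with the given dilation rate.
    Dilation widens the receptive field without adding weights."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    channels, kernel = 128, 3  # hypothetical layer size, not from the paper
    dense = standard_conv_params(channels, channels, kernel)
    factored = separable_asymmetric_params(channels, channels, kernel)
    print("standard 3x3 conv weights:", dense)
    print("separable asymmetric weights:", factored)
    print("reduction factor: %.2f" % (dense / factored))
    print("extent of a 3-tap kernel at dilation 4:",
          effective_kernel_extent(kernel, 4))
```

Under these assumptions the point-wise term dominates the factored count, so the saving grows with kernel size, while dilation enlarges the receptive field at no extra parameter cost.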




Updated: 2021-05-07