Abstract
In this paper, we present an improved Efficient Neural Network (ENet) for semantic segmentation, which we name the Efficient Residual Neural Network (ERNet). ERNet contains two processing streams: a pooling stream, which captures high-dimensional semantic information, and a residual stream, which preserves low-dimensional boundary information. ERNet has five stages, each containing several bottleneck modules, and the output of each bottleneck is fed into the residual stream. From the second stage onward, the concatenation of the pooling stream and the residual stream serves as the input to each down-sampling or up-sampling bottleneck. The identity mapping of the residual stream shortens the path between the input and output of each stage, alleviates the vanishing-gradient problem, strengthens the propagation of low-dimensional boundary features, and encourages their reuse. We tested ERNet on the CamVid, Cityscapes, and SUN RGB-D datasets. The segmentation speed of ERNet is close to that of ENet, while its segmentation accuracy is higher.
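As a minimal sketch of the fusion step described above, the snippet below shows how pooling-stream and residual-stream feature maps might be concatenated along the channel axis before a down-sampling or up-sampling bottleneck. It uses NumPy arrays as stand-ins for feature tensors; the function name, shapes, and channel counts are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def concat_streams(pooling_feat, residual_feat):
    """Concatenate pooling-stream and residual-stream feature maps
    along the channel axis (channels-first layout assumed)."""
    # Spatial dimensions must match before concatenation; the residual
    # stream is assumed to have been rescaled to the pooling stream's
    # resolution at this stage.
    assert pooling_feat.shape[1:] == residual_feat.shape[1:]
    return np.concatenate([pooling_feat, residual_feat], axis=0)

# Hypothetical shapes: (channels, height, width)
pooling = np.zeros((64, 32, 32))   # high-dimensional semantic features
residual = np.zeros((16, 32, 32))  # low-dimensional boundary features
fused = concat_streams(pooling, residual)
print(fused.shape)  # (80, 32, 32)
```

The fused tensor then feeds the next bottleneck, so each stage sees both semantic context and boundary detail.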
Funding
This work was supported by the Natural Science Foundation Project of the Science and Technology Department of Jilin Province under grant no. 20200201165JC.
Ethics declarations
COMPLIANCE WITH ETHICAL STANDARDS
This article does not contain any studies involving animals performed by any of the authors.
This article does not contain any studies involving human participants performed by any of the authors.
CONFLICT OF INTEREST
The authors declare that there is no conflict of interest regarding the publication of this paper.
Additional information
Bin Li received his M.S. and Ph.D. degrees from the School of Computer Science and Technology at Jilin University, China, in 2011 and 2015, respectively. He is currently an Associate Professor with the School of Computer Science at Northeast Electric Power University. His research interests include image processing, computer vision, and pattern recognition.
Junyue Zang received her bachelor's degree from Qingdao University of Technology in 2017. She is currently a graduate student with the School of Computer Science at Northeast Electric Power University. Her research interests include computer vision, image processing, and deep learning.
Jie Cao received her Ph.D. degree in Computer Science and Technology from Jilin University, Changchun, China, in 2017. She is currently an Associate Professor and master's supervisor with the School of Computer Science, Northeast Electric Power University, Jilin. Her research interests include computer networks, machine learning, and power grid stability and security.
About this article
Cite this article
Li, B., Zang, J. & Cao, J. Efficient Residual Neural Network for Semantic Segmentation. Pattern Recognit. Image Anal. 31, 212–220 (2021). https://doi.org/10.1134/S1054661821020103