CT-UNet: Context-Transfer-UNet for Building Segmentation in Remote Sensing Images

Liu, Sheng; Ye, Huanran; Jin, Kun; Cheng, Haohao

doi:10.1007/s11063-021-10592-w

Published: 02 August 2021

Volume 53, pages 4257–4277, (2021)

Neural Processing Letters

Abstract

With the proliferation of remote sensing images, segmenting buildings accurately in these images has become a critical challenge. First, most networks have poor recognition ability on high-resolution images, which results in blurred boundaries in the segmented building maps. Second, the similarity between buildings and background results in intra-class inconsistency. To address these two problems, we propose a UNet-based network named Context-Transfer-UNet (CT-UNet). Specifically, we design a Dense Boundary Block. The Dense Block uses a reuse mechanism to refine features and increase recognition capability. The Boundary Block introduces low-level spatial information to solve the fuzzy boundary problem. Then, to handle intra-class inconsistency, we construct a Spatial Channel Attention Block. It combines contextual spatial information and selects more distinguishable features along the spatial and channel dimensions. Finally, we propose an improved loss function that adds an evaluation indicator to make the loss more targeted. With the proposed CT-UNet, we achieve 85.33% mean IoU on the Inria dataset, 91.00% mean IoU on the WHU dataset and 83.92% F1-score on the Massachusetts dataset. These results outperform our baseline (U-Net ResNet-34) by 3.76%, exceed Web-Net by 2.24% and surpass HFSA-Unet by 2.17%.
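
The abstract reports IoU and F1-score and describes a loss that adds an evaluation indicator, but it gives no formulas. The sketch below shows how these two metrics are commonly computed for a binary building mask, and one plausible shape for such a loss: binary cross-entropy plus a soft IoU term. The function names, the `weight` parameter and the choice of soft IoU as the added indicator are assumptions made for illustration and are not taken from the paper.

```python
import math


def iou_and_f1(pred, target):
    """Return (IoU, F1) of the positive class for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


def bce_plus_soft_iou_loss(probs, target, weight=1.0, eps=1e-7):
    """Hypothetical combined loss: mean BCE + weight * (1 - soft IoU).

    `probs` are predicted building probabilities in [0, 1];
    `target` holds the 0/1 ground-truth labels.
    """
    # Clamp probabilities so that log() never receives 0.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1.0 - p)
        for p, t in zip(clipped, target)
    ) / len(clipped)
    # Soft IoU: products and sums stand in for set intersection and union.
    inter = sum(p * t for p, t in zip(clipped, target))
    union = sum(p + t - p * t for p, t in zip(clipped, target))
    soft_iou = (inter + eps) / (union + eps)
    return bce + weight * (1.0 - soft_iou)


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
iou, f1 = iou_and_f1(pred, target)

good_loss = bce_plus_soft_iou_loss([0.9, 0.1, 0.1, 0.1, 0.9, 0.9], target)
bad_loss = bce_plus_soft_iou_loss([0.1, 0.9, 0.9, 0.9, 0.1, 0.1], target)
```

In this toy example the prediction has two true positives, one false positive and one false negative, and the loss is lower for the probability map that agrees with the labels. A mean IoU as reported in the abstract would average per-class IoU values; the exact averaging convention used in the paper is not stated here.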

References

Adiba A, Hajji H, Maatouk M (2019) Transfer learning and u-net for buildings segmentation. In: Proceedings of the new challenges in data sciences: acts of the second conference of the Moroccan classification society, ACM, p 14
Aptoula E (2013) Remote sensing image retrieval with global morphological texture descriptors. IEEE Trans Geosci Remote Sens 52(5):3023–3034
Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: A deep convolutional encoder–decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. https://arxiv.org/abs/1409.0473
Bischke B, Helber P, Folz J, Borth D, Dengel A (2019) Multi-task learning for segmentation of building footprints with deep neural networks. In: 2019 IEEE international conference on image processing (ICIP), IEEE, pp 1480–1484
Fourure D, Emonet R, Fromont E, Muselet D, Tremeau A, Wolf C (2017) Residual conv-deconv grid network for semantic segmentation. https://arxiv.org/abs/1707.07958
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp 315–323
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
He N, Fang L, Plaza A (2020) Hybrid first and second order attention unet for building segmentation in remote sensing images. Sci China Inf Sci 63:140305
Hu S, Ning Q, Chen B, Lei Y, Zhou X, Yan H, Zhao C, Tang T, Hu R (2020) Segmentation of aerial image with multi-scale feature and attention model. In: Artificial Intelligence in China, Springer, pp 58–66
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
Ji S, Wei S, Lu M (2018) Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans Geosci Remote Sens 57(1):574–586
Khalel A, El-Saban M (2018) Automatic pixelwise object labeling for aerial imagery using stacked u-nets. https://arxiv.org/abs/1803.04953
Kim JH, Lee H, Hong SJ, Kim S, Park J, Hwang JY, Choi JP (2018) Objects segmentation from high-resolution aerial images using u-net with pyramid pooling layers. IEEE Geosci Remote Sens Lett 16(1):115–119
Liu Y, Gross L, Li Z, Li X, Fan X, Qi W (2019) Automatic building extraction on high-resolution remote sensing imagery using deep convolutional encoder-decoder with spatial pyramid pooling. IEEE Access 7:128774–128786
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440
Luong MT, Pham H, Manning CD (2015) Effective approaches to attention-based neural machine translation. https://arxiv.org/abs/1508.04025
Maggiori E, Tarabalka Y, Charpiat G, Alliez P (2017) Can semantic labeling methods generalize to any city? the inria aerial image labeling benchmark. In: 2017 IEEE international geoscience and remote sensing symposium (IGARSS), IEEE, pp 3226–3229
Mitra P, Shankar BU, Pal SK (2004) Segmentation of multispectral remote sensing images using active support vector machines. Pattern Recogn Lett 25(9):1067–1074
Mnih V (2013) Machine learning for aerial image labeling. Citeseer
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al. (2018) Attention u-net: Learning where to look for the pancreas. https://arxiv.org/abs/1804.03999
Pal M (2005) Random forest classifier for remote sensing classification. Int J Remote Sens 26(1):217–222
Pan X, Yang F, Gao L, Chen Z, Zhang B, Fan H, Ren J (2019) Building extraction from high-resolution aerial imagery using a generative adversarial network with spatial and channel attention mechanisms. Remote Sens 11(8):917
Qi HN, Yang JG, Zhong YW, Deng C (2004) Multi-class svm based remote sensing image classification and its semi-supervised improvement scheme. In: Proceedings of 2004 international conference on machine learning and cybernetics (IEEE Cat. No. 04EX826), IEEE, vol 5, pp 3146–3151
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, Springer, pp 234–241
Sebastian C, Imbriaco R, Bondarev E, de With PH (2020) Adversarial loss for semantic segmentation of aerial imagery. https://arxiv.org/abs/2001.04269
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. https://arxiv.org/abs/1409.1556
Singh PP, Garg R (2013) Automatic road extraction from high resolution satellite image using adaptive global thresholding and morphological operations. J Ind Soc Remote Sens 41(3):631–640
Song L, Xu Y, Zhang L, Du B, Zhang Q, Wang X (2020) Learning from synthetic images via active pseudo-labeling. IEEE Transactions on Image Processing
Tuermer S, Kurz F, Reinartz P, Stilla U (2013) Airborne vehicle detection in dense urban areas using hog features and disparity maps. IEEE J Select Top Appl Earth Observ Remote Sens 6(6):2327–2337
Xia J, Du P, He X, Chanussot J (2013) Hyperspectral remote sensing image classification based on rotation forest. IEEE Geosci Remote Sens Lett 11(1):239–243
Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: International conference on machine learning, pp 2048–2057
Yi Y, Zhang Z, Zhang W, Zhang C, Li W, Zhao T (2019) Semantic segmentation of urban buildings from VHR remote sensing imagery using a deep convolutional neural network. Remote Sens 11(15):1774
Yu C, Wang J, Peng C, Gao C, Yu G, Sang N (2018) Learning a discriminative feature network for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1857–1866
Zhang L, Zhang L, Tao D, Huang X (2011) On combining multiple features for hyperspectral remote sensing image classification. IEEE Trans Geosci Remote Sens 50(3):879–893
Zhang Y, Gong W, Sun J, Li W (2019) Web-net: A novel nest networks with ultra-hierarchical sampling for building extraction from aerial imageries. Remote Sens 11(16):1897
Zhang Z, Liu Q, Wang Y (2018) Road extraction by deep residual u-net. IEEE Geosci Remote Sens Lett 15(5):749–753
Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2881–2890
Zhao H, Zhang Y, Liu S, Shi J, Change Loy C, Lin D, Jia J (2018) Psanet: Point-wise spatial attention network for scene parsing. In: Proceedings of the European conference on computer vision (ECCV), pp 267–283
Zhou Z, Siddiquee MMR, Tajbakhsh N, Liang J (2018) Unet++: A nested u-net architecture for medical image segmentation. In: Deep learning in medical image analysis and multimodal learning for clinical decision support, Springer, pp 3–11

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2018YFB1305200) and the Science Technology Department of Zhejiang Province (No. LGG19F020010). An earlier version of this paper was accepted at the International Conference on Pattern Recognition.

Author information

Authors and Affiliations

Institution of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, Zhejiang, China
Sheng Liu, Huanran Ye, Kun Jin & Haohao Cheng

Corresponding author

Correspondence to Sheng Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, S., Ye, H., Jin, K. et al. CT-UNet: Context-Transfer-UNet for Building Segmentation in Remote Sensing Images. Neural Process Lett 53, 4257–4277 (2021). https://doi.org/10.1007/s11063-021-10592-w

Accepted: 12 July 2021
Published: 02 August 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s11063-021-10592-w
