Discrete convolutional CRF networks for depth estimation from monocular infrared images

Wang, Qianqian; Zhao, Haitao; Hu, Zhengwei; Chen, Yuru; Li, Yuqi

doi:10.1007/s13042-020-01164-w

Original Article
Published: 04 July 2020

Volume 12, pages 187–200 (2021)

International Journal of Machine Learning and Cybernetics

Abstract

Predicting the depth of a scene from monocular infrared images plays a crucial role in understanding three-dimensional structure and is a challenging task in machine learning and computer vision. Because infrared images lack texture and color information, a novel discrete convolutional conditional random field (CRF) network is proposed for depth estimation. The proposed method inherits several merits of conditional random fields and deep learning. First, the pairwise features are automatically extracted and optimized through deep architectures. Second, monocular-image-based depth regression is converted into a multi-class classification problem, in which the ordering of the depth levels is taken into account in the loss function. Our experiments demonstrate that this conversion achieves much higher accuracy and faster convergence. Third, to recover fine-grained details, we further propose a multi-scale discrete convolutional CRF network that computes the pairwise features of the discrete CRF at different spatial levels. Extensive experiments on the infrared image dataset NUSTMS demonstrate that the proposed method outperforms other depth estimation methods. Specifically, the proposed method achieves a mean relative error of 0.181, a mean log10 error of 0.072, and a threshold accuracy (t = 1.25³) of 95.3%.
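The regression-to-classification step and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how depth is discretized, so the log-uniform bins, the depth range, and the helper names `depth_to_class`, `class_to_depth`, and `evaluate` are assumptions made for illustration. The metric definitions follow the usual conventions in the monocular depth literature (mean relative error, mean log10 error, and the share of pixels whose ratio max(pred/truth, truth/pred) is below the threshold t).

```python
import math


def depth_to_class(depth, d_min, d_max, num_classes):
    """Map a metric depth to a class index.

    Assumption: bins are uniform in log-depth; the paper's actual
    discretization may differ.
    """
    depth = min(max(depth, d_min), d_max)  # clamp into the valid range
    span = math.log(d_max) - math.log(d_min)
    frac = (math.log(depth) - math.log(d_min)) / span
    # frac == 1.0 at d_max would give num_classes, so cap the index.
    return min(int(frac * num_classes), num_classes - 1)


def class_to_depth(index, d_min, d_max, num_classes):
    """Decode a class index to the log-space centre of its bin."""
    span = math.log(d_max) - math.log(d_min)
    frac = (index + 0.5) / num_classes
    return math.exp(math.log(d_min) + frac * span)


def evaluate(pred, truth, threshold=1.25 ** 3):
    """Return (mean relative error, mean log10 error, threshold accuracy).

    pred and truth are equal-length sequences of positive depths.
    """
    n = len(truth)
    pairs = list(zip(pred, truth))
    rel = sum(abs(p - t) / t for p, t in pairs) / n
    log10_err = sum(abs(math.log10(p) - math.log10(t)) for p, t in pairs) / n
    acc = sum(max(p / t, t / p) < threshold for p, t in pairs) / n
    return rel, log10_err, acc
```

Because neighbouring class indices correspond to neighbouring depth ranges, the labels are ordered, which is the property an order-aware loss can exploit; the loss itself is not specified in the abstract and is not sketched here.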

Author information

Authors and Affiliations

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
Qianqian Wang, Haitao Zhao, Zhengwei Hu, Yuru Chen & Yuqi Li

Corresponding author

Correspondence to Haitao Zhao.

About this article

Cite this article

Wang, Q., Zhao, H., Hu, Z. et al. Discrete convolutional CRF networks for depth estimation from monocular infrared images. Int. J. Mach. Learn. & Cyber. 12, 187–200 (2021). https://doi.org/10.1007/s13042-020-01164-w

Received: 03 February 2019
Accepted: 25 June 2020
Published: 04 July 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s13042-020-01164-w
