Discrete convolutional CRF networks for depth estimation from monocular infrared images

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Predicting scene depth from monocular infrared images is crucial for understanding three-dimensional structure and remains a challenging task in machine learning and computer vision. Because infrared images lack texture and color information, a novel discrete convolutional conditional random field (CRF) network is proposed for depth estimation. The proposed method combines the merits of conditional random fields and deep learning. First, the pairwise features are automatically extracted and optimized through deep architectures. Second, monocular depth regression is converted into a multi-class classification problem whose loss function accounts for the ordering of the depth levels. Our experiments demonstrate that this conversion achieves much higher accuracy and faster convergence. Third, to recover fine-grained details, a multi-scale discrete convolutional CRF network is further proposed that computes the pairwise features of the discrete CRF at different spatial levels. Extensive experiments on the infrared image dataset NUSTMS demonstrate that the proposed method outperforms other depth estimation methods. Specifically, the proposed method achieves a mean relative error of 0.181, a mean log10 error of 0.072, and an accuracy of 95.3% at the threshold t = 1.25^3.
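
The three figures reported at the end of the abstract can be illustrated with a small sketch. It assumes the definitions commonly used in the monocular depth estimation literature: relative error |d - d*| / d*, log10 error |log10 d - log10 d*|, and threshold accuracy as the fraction of pixels with max(d / d*, d* / d) < t, where the reported threshold is taken to be the conventional 1.25^3. The abstract does not state the authors' exact formulas, so the function name `depth_metrics` and the toy depth values are hypothetical and this is not the paper's evaluation code.

```python
import math


def depth_metrics(pred, truth, threshold=1.25 ** 3):
    """Return (mean relative error, mean log10 error, threshold accuracy).

    pred and truth are equal-length sequences of positive depths.
    Definitions follow common practice and are assumed, not taken from the paper.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    n = len(pred)
    pairs = list(zip(pred, truth))
    # Mean relative error: absolute error normalised by the true depth.
    rel = sum(abs(p - t) / t for p, t in pairs) / n
    # Mean log10 error: absolute difference of base-10 logarithms.
    log10_err = sum(abs(math.log10(p) - math.log10(t)) for p, t in pairs) / n
    # Threshold accuracy: share of pixels whose depth ratio stays below the threshold.
    acc = sum(1 for p, t in pairs if max(p / t, t / p) < threshold) / n
    return rel, log10_err, acc


# Toy example with three "pixels" (values are made up for illustration).
pred = [2.0, 4.0, 10.0]
truth = [2.0, 5.0, 4.0]
rel, log10_err, acc = depth_metrics(pred, truth)
print(f"rel={rel:.3f} log10={log10_err:.3f} acc={acc:.3f}")
```

In the toy example the third pixel has a depth ratio of 2.5, which exceeds 1.25^3 = 1.953125, so only two of the three pixels count as accurate.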

Author information

Corresponding author

Correspondence to Haitao Zhao.

About this article

Cite this article

Wang, Q., Zhao, H., Hu, Z. et al. Discrete convolutional CRF networks for depth estimation from monocular infrared images. Int. J. Mach. Learn. & Cyber. 12, 187–200 (2021). https://doi.org/10.1007/s13042-020-01164-w

  • DOI: https://doi.org/10.1007/s13042-020-01164-w
