Enhanced machine perception by a scalable fusion of RGB–NIR image pairs in diverse exposure environments

Original Paper · Published in Machine Vision and Applications

Abstract

A multi-spectral imaging technique for the swift fusion of red–green–blue (RGB) and near-infrared (NIR) image pairs, combined with a deep-learning-based resolution enhancement technique, is proposed, empirically investigated, and compared with state-of-the-art techniques in the current work. The results of the proposed multi-spectral image fusion demonstrate good chrominance preservation, improved sharpness, and optimised lighting in low-light dawn and dusk scenes. The fused image combines edges inherent to both the RGB and NIR spectral images; examples include improved separation between vegetation and the sky and between shadowed and non-shadowed areas, as well as increased apparent depth in tree branches and vehicles. A hue, saturation, value (HSV)–NIR fusion is also evaluated by simply converting the RGB image to the HSV colour space before fusion. Owing to its higher colour strength, the HSV-based fusion renders high-colour-contrast artefacts, such as road signs and the rear of vehicles, better than the RGB-based fused equivalent. Empirical results show that RGB–NIR fusion outperforms other strategies on a contrast restoration metric (r), two image quality assessment (IQA) metrics, and the peak signal-to-noise ratio (PSNR). The two image fusion models are then fed into a deep learning semantic segmentation network to investigate their consistency in real-world scenarios; the proposed coarse-grained semantic segmentation network is trained to auto-annotate pixels as belonging to one of 10 classes. The per-class performance of the RGB–NIR- and HSV–NIR-based semantic segmentation in comparison with other methods is discussed in detail in the current work.
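To make the HSV–NIR variant concrete, the sketch below shows one minimal way such a fusion could be implemented: convert the RGB frame to HSV, blend the registered NIR signal into the value channel only, and convert back. This is an illustrative approximation rather than the authors' exact pipeline, and the weighted-blend rule and the `alpha` parameter are assumptions introduced here.

```python
# Illustrative sketch only: fuse a registered NIR frame into the V (value)
# channel of an HSV-converted RGB image, leaving hue and saturation (the
# chrominance the abstract says is preserved) untouched. The blend rule and
# the `alpha` weight are assumptions, not the paper's method.
import cv2
import numpy as np

def hsv_nir_fuse(rgb: np.ndarray, nir: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """rgb: HxWx3 uint8 image (RGB order); nir: HxW uint8, registered to rgb."""
    hsv = cv2.cvtColor(rgb, cv2.COLOR_RGB2HSV)
    h, s, v = cv2.split(hsv)
    # Blend NIR into the value channel: lifts shadowed regions and carries
    # NIR edge detail into the result while H and S stay intact.
    v_fused = cv2.addWeighted(v, 1.0 - alpha, nir, alpha, 0.0)
    return cv2.cvtColor(cv2.merge([h, s, v_fused]), cv2.COLOR_HSV2RGB)
```

An RGB–NIR fusion along the same lines would instead inject the NIR signal into the colour channels (or into a multi-scale decomposition such as the DWT bands listed under the abbreviations) before recombination.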
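Similarly, the full-reference scores mentioned above (PSNR and an SSIM-style IQA metric) can be computed in outline with scikit-image. This is a generic evaluation sketch under assumed inputs; the paper's actual protocol, reference images, and its no-reference BRISQUE/NIQE scores are not reproduced here.

```python
# Generic evaluation sketch (assumed setup): score a fused image against a
# reference with PSNR and SSIM via scikit-image. Placeholder arrays stand in
# for the real reference and fused frames.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
# Simulate a slightly perturbed "fused" result for demonstration.
noise = rng.integers(-10, 11, size=reference.shape)
fused = np.clip(reference.astype(int) + noise, 0, 255).astype(np.uint8)

psnr = peak_signal_noise_ratio(reference, fused, data_range=255)
ssim = structural_similarity(reference, fused, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}")
```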

Abbreviations

BRISQUE: Blind/Referenceless Image Spatial Quality Evaluator
CNN: Convolutional neural network
CRF: Conditional random field
DCNN: Deep convolutional neural network
DWT: Discrete wavelet transform
FAAGKFCM: Fast and automatically adjustable GRBF kernel-based FCM
FN: False negative
FP: False positive
GPU: Graphics processing unit
IDWT: Inverse discrete wavelet transform
IOU: Intersection over union
IQA: Image quality assessment
ILSVRC: ImageNet Large Scale Visual Recognition Challenge
MEITY: Ministry of Electronics and Information Technology
NIR: Near infrared
NIQE: Naturalness Image Quality Evaluator
PSNR: Peak signal-to-noise ratio
RANUS: RGB and NIR urban scene dataset
RGB: Red–green–blue
SGDM: Stochastic gradient descent with momentum
SIFT: Scale-invariant feature transform
SISR: Single image super-resolution
SSIM: Structural Similarity Index
TN: True negative
TP: True positive
UAV: Unmanned aerial vehicle
VDSR: Very deep super resolution
VGG: Visual Geometry Group

Acknowledgements

The current work is supported by a research grant from the Ministry of Electronics and Information Technology (MEITY), Government of India, under Grant No. 4(6)/2018-ITEA.

Author information

Corresponding author

Correspondence to Wahengbam Kanan Kumar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kumar, W., Singh, N., Singh, A. et al. Enhanced machine perception by a scalable fusion of RGB–NIR image pairs in diverse exposure environments. Machine Vision and Applications 32, 88 (2021). https://doi.org/10.1007/s00138-021-01210-9
