
SDNet: A Versatile Squeeze-and-Decomposition Network for Real-Time Image Fusion

Published in: International Journal of Computer Vision

Abstract

In this paper, we propose a squeeze-and-decomposition network (SDNet) that performs multi-modal and digital photography image fusion in real time. First, we cast multiple fusion problems as the extraction and reconstruction of gradient and intensity information, and accordingly design a universal loss function composed of an intensity term and a gradient term. For the gradient term, we introduce an adaptive decision block that selects the optimization target of the gradient distribution according to texture richness at the pixel scale, guiding the fused image to contain richer texture details. For the intensity term, we adjust the weight of each intensity loss to control the proportion of intensity information drawn from each source image, so that the same loss adapts to multiple fusion tasks. Second, we introduce the idea of squeeze and decomposition into image fusion: we consider not only the squeeze from the source images to the fused result, but also the decomposition from the fused result back to the source images. Because the quality of the decomposed images depends directly on the fused result, the decomposition forces the fused result to retain more scene details. Experimental results demonstrate that our method outperforms state-of-the-art methods in both subjective visual quality and quantitative metrics across a variety of fusion tasks. Moreover, our method is much faster than existing methods, enabling real-time fusion.
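To make the loss design in the abstract concrete, the sketch below gives a minimal PyTorch rendering of its three ideas: a gradient term whose per-pixel target is chosen adaptively from the texture-richer source, a weighted intensity term, and a decomposition consistency term. This is our illustrative reading, not the authors' implementation: the names (`fusion_loss`, `decomposition_loss`, `w1`, `w2`), the Laplacian gradient operator, and the single-channel input assumption are all assumptions; the paper's actual network and loss details come from the full text.

```python
# Minimal sketch of the loss ideas described in the abstract.
# Hypothetical names and operators; not the authors' code.
import torch
import torch.nn.functional as F

# Laplacian kernel as one possible gradient operator (an assumption;
# the paper may use a different operator).
LAPLACIAN = torch.tensor([[[[0., 1., 0.],
                            [1., -4., 1.],
                            [0., 1., 0.]]]])

def grad_mag(x):
    """Per-pixel gradient magnitude of a [N, 1, H, W] image."""
    return F.conv2d(x, LAPLACIAN.to(x.device), padding=1).abs()

def fusion_loss(fused, src1, src2, w1=0.5, w2=0.5):
    # Gradient term: at each pixel, the adaptive decision picks the
    # source gradient with richer texture (larger magnitude) as the
    # optimization target, pushing the fused image toward the sharper
    # of the two inputs.
    g1, g2 = grad_mag(src1), grad_mag(src2)
    target = torch.where(g1 >= g2, g1, g2)
    loss_grad = F.l1_loss(grad_mag(fused), target)

    # Intensity term: w1/w2 set the proportion of intensity information
    # taken from each source; retuning them adapts the same loss to
    # different fusion tasks.
    loss_int = w1 * F.l1_loss(fused, src1) + w2 * F.l1_loss(fused, src2)
    return loss_grad + loss_int

def decomposition_loss(dec1, dec2, src1, src2):
    # Decomposition term: a decomposition branch maps the fused image
    # back to approximations (dec1, dec2) of the sources; penalizing
    # their mismatch forces the fused result to keep enough scene
    # detail to reconstruct both.
    return F.l1_loss(dec1, src1) + F.l1_loss(dec2, src2)
```

Under this reading, retuning `w1` and `w2` is what adapts the one objective to different tasks (e.g., weighting the thermal image more heavily for infrared-visible fusion), while the decomposition term is what ties the fused result back to both sources.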






Author information

Corresponding author

Correspondence to Jiayi Ma.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Ioannis Gkioulekas.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, H., Ma, J. SDNet: A Versatile Squeeze-and-Decomposition Network for Real-Time Image Fusion. Int J Comput Vis 129, 2761–2785 (2021). https://doi.org/10.1007/s11263-021-01501-8

