Full length article
RXDNFuse: An aggregated residual dense network for infrared and visible image fusion
Introduction
Image fusion is an important modern enhancement technique that aims to combine multiple input images into a single robust and informative image, offering a more complete and detailed scene representation that can facilitate subsequent processing or decision making [1], [2]. Image fusion technology has been used to enhance performance in terms of human visual perception, object detection, and target recognition [3], [4], [5], [6], [7]. In particular, infrared and visible image fusion plays an especially significant role in scene detection and in the tracking performance of video surveillance systems. These two types of images provide complementary scene information from different aspects. Infrared images can easily distinguish targets from the background thanks to their highly discriminative thermal radiation, and they work well day and night and under all weather conditions. However, infrared images usually lack texture and thus describe details poorly. By contrast, visible images contain textural details with high spatial resolution, which enhances target recognition and conforms to the human visual system. Therefore, how to effectively combine this complementary information is the main focus of fusion methods.
The image fusion task has been approached with different schemes in recent years. Existing fusion methods can be roughly divided into two categories. (i) Traditional methods. Most typically, multiscale transform methods have been applied to extract salient image features, such as the discrete wavelet transform (DWT) [8], [9], [10]. Representation learning-based methods have also attracted great attention, such as sparse representation (SR) [11] and joint sparse representation (JSR) [12]. Subspace-based methods [13], [14], saliency-based methods [15], [16] and hybrid models [17], [18], [19] have also been applied to the image fusion task. (ii) Deep learning-based methods. Given the rapid advances in deep learning, the convolutional neural network (CNN) [20], [21] is used to extract image features and reconstruct the fused image. Owing to the strong fitting ability of neural networks, CNN-based methods [22], [23] have achieved better performance in image processing and have therefore been widely applied to fusion tasks. Ma et al. [24] proposed an unsupervised network to generate the decision map for fusion. Ma et al. [25] proposed an end-to-end model called FusionGAN, which generates a fused image with dominant infrared intensities and additional visible gradients.
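Traditional methods of this kind rely on hand-crafted fusion rules. As a rough illustration only (not any specific published algorithm), a minimal activity-based rule might select, block by block, the source image with higher local variance and average elsewhere:

```python
import numpy as np

def average_max_fuse(ir, vis, block=8):
    """Toy hand-crafted fusion rule: within each block, keep the source
    block with higher local variance (an 'activity' measure); regions not
    covered by a full block fall back to the pixel-wise average.
    Illustrative only -- not any particular published method."""
    ir = ir.astype(np.float64)
    vis = vis.astype(np.float64)
    fused = (ir + vis) / 2.0  # default: plain average
    h, w = ir.shape
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            a = ir[y:y + block, x:x + block]
            b = vis[y:y + block, x:x + block]
            # pick the more 'active' (higher-variance) source block
            fused[y:y + block, x:x + block] = a if a.var() > b.var() else b
    return fused
```

Rules like this must be tuned per scene, which is exactly the manual-design burden that end-to-end learned methods aim to remove.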
Although existing methods achieve good results in their corresponding fusion tasks, they still have drawbacks that limit fusion performance. First, the fusion rules in most current methods are increasingly complex and designed manually; these rules introduce artifacts into the fusion results. Second, CNN-based fusion methods typically use only the output of the last feature extraction layer as the image fusion component. This approach discards large amounts of useful information captured by the intermediate convolutional layers, which directly affects the final fusion performance. Third, existing fusion methods usually lack competitiveness in terms of time and storage due to their computational complexity and large numbers of parameters.
To overcome the abovementioned challenges, we propose an end-to-end network, namely RXDNFuse, to perform the infrared and visible image fusion task. This network does not require manually designed fusion rules and can effectively utilize the deep features extracted from the source images. More specifically, infrared thermal radiation information is characterized by pixel intensities, while textural detail information in visible images is typically characterized by edges and gradients [26]; the preservation of these source details often determines the clarity of the fused image. To further improve this aspect, we design two loss function strategies, namely a pixel-wise strategy and a feature-wise strategy, to force the fused image to retain more texture details. Furthermore, a new feature extraction module, RXDB, is designed to lighten the fusion framework and thereby improve the time efficiency of image fusion. A schematic illustration of different image fusion methods is shown in Fig. 1.
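The exact loss formulations are not given in this excerpt. As a hedged sketch, a pixel-wise strategy of the kind described (intensities tied to the infrared image, gradients tied to the visible image) could look like the following, where the forward-difference gradient operator, the squared/absolute error forms, and the weight `lam` are all illustrative assumptions rather than the paper's formulation:

```python
import numpy as np

def grad(img):
    # Forward-difference gradients (a simple stand-in for e.g. a Sobel operator).
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return gx, gy

def pixel_wise_loss(fused, ir, vis, lam=5.0):
    """Sketch of a pixel-wise fusion loss: an intensity term pulling the
    fused image toward infrared radiation, plus a gradient term pulling
    it toward visible texture. Weighting and norms are assumptions."""
    intensity = np.mean((fused - ir) ** 2)
    fgx, fgy = grad(fused)
    vgx, vgy = grad(vis)
    texture = np.mean(np.abs(fgx - vgx) + np.abs(fgy - vgy))
    return intensity + lam * texture
```

A feature-wise strategy would replace the raw pixel/gradient terms with distances between deep feature maps of a pre-trained network such as VGG-19, penalizing perceptual rather than pixel-level differences.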
The characteristics and contributions of our work can be summarized in the following four aspects. First, we propose an end-to-end fusion architecture based on the aggregated residual dense network to solve the infrared and visible image fusion problem. Our approach avoids the need for manually designing complicated image decomposition measurements and fusion rules, and adequately utilizes the hierarchical features of the source images. Second, we propose two loss function strategies to optimize the model similarity constraint and the quality of detailed information: the pixel-wise strategy directly exploits the original information from the source images, and the feature-wise strategy calculates a more detailed loss based on a pre-trained VGG-19 network [28]. Third, we conduct experiments on public infrared and visible image fusion datasets with qualitative and quantitative comparisons to state-of-the-art methods. Compared with five existing methods, the proposed RXDNFuse obtains fusion results with excellent visual quality in the background information while also containing highlighted thermal radiation targets. Finally, we generalize RXDNFuse to fuse images with different resolutions as well as RGB images, enabling it to generate clear and natural fused images.
The remainder of this paper is structured as follows. In Section 2, we briefly review related works on deep learning frameworks. Section 3 presents the details of the proposed RXDNFuse network for infrared and visible image fusion. Abundant experimental results and analysis are illustrated in Section 4. Finally, Section 5 provides a discussion and summarizes the paper.
Related works
In this section, we briefly review the advances and relevant works in the image fusion field, including traditional infrared and visible image fusion methods, VGG network deep learning models, typical deep learning-based image processing techniques, and their improved variants.
Traditional fusion methods. With the fast-growing demand for and progress in image representation, numerous infrared and visible image fusion methods have been proposed. As reconstruction is usually an inverse process
Method
This section describes the proposed RXDNFuse for infrared and visible image fusion in detail. We start by presenting the problem formulation to describe the details of feature processing, and then discuss the detailed structure of our RXDNFuse. Finally, we design two strategy options for the loss function of our network, and elaborate on other details for model training.
Experimental results and analysis
In this section, we first briefly describe the datasets and metrics used in the experiments. Subsequently, the effectiveness of RXDNFuse is verified through qualitative and quantitative evaluations against five state-of-the-art fusion methods on the TNO, INO and OTCBVS datasets.
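The excerpt does not list which quantitative metrics are used; entropy (EN) is one of the standard fusion metrics (higher EN suggests the fused image carries more information), and a minimal sketch of it for 8-bit images is:

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (EN) of an 8-bit image, a common quantitative
    fusion metric; higher values suggest richer information content."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is well-defined
    return float(-(p * np.log2(p)).sum())
```

Other metrics commonly reported in this literature (e.g. mutual information or structural-similarity-based measures) compare the fused image against both source images rather than scoring it in isolation.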
Conclusions
In this paper, we propose an effective infrared and visible image fusion method based on the aggregated residual dense network. The proposed RXDNFuse is an end-to-end model that avoids the manual design of image decomposition measurements and fusion rules. It simultaneously preserves the thermal radiation information of infrared images and the texture detail information of visible images. Specifically, our fused results look like detailed visible images with clear
CRediT authorship contribution statement
Yongzhi Long: Conceptualization, Methodology, Validation, Formal analysis, Visualization, Software, Writing - original draft. Haitao Jia: Conceptualization, Methodology, Validation, Formal analysis, Visualization. Yida Zhong: Resources, Writing - review & editing, Supervision, Data curation. Yadong Jiang: Writing - review & editing. Yuming Jia: Writing - review & editing, Supervision.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Sichuan Science and Technology Project (Grant nos. 2018GZDZX003, 2020YFG0306, 2020YFG0055 and 2020YFG0327), and the Science and Technology Program of Hebei (Grant nos. 19255901D and 20355901D).
References (51)
- et al., Infrared and visible image fusion methods and applications: A survey, Inf. Fusion (2019)
- et al., Integrated multilevel image fusion and match score fusion of visible and infrared face images for robust face recognition, Pattern Recognit. (2008)
- et al., Fusion of color and infrared video for moving human detection, Pattern Recognit. (2007)
- et al., Multisensor image fusion using the wavelet transform, Graph. Models Image Process. (1995)
- et al., Performance comparison of different multi-resolution transforms for image fusion, Inf. Fusion (2011)
- et al., Fusion method for infrared and visible images by using non-negative sparse representation, Infrared Phys. Technol. (2014)
- et al., Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization, Infrared Phys. Technol. (2014)
- et al., Infrared image enhancement through saliency feature analysis based on multi-scale decomposition, Infrared Phys. Technol. (2014)
- et al., A general framework for image fusion based on multi-scale transform and sparse representation, Inf. Fusion (2015)
- et al., Infrared and visible image fusion based on visual saliency map and weighted least square optimization, Infrared Phys. Technol. (2017)
- Infrared and visible image fusion via gradient transfer and total variation minimization, Inf. Fusion
- A fusion algorithm for infrared and visible images based on adaptive dual-channel unit-linking PCNN in NSCT domain, Infrared Phys. Technol.
- Multi-focus image fusion with a deep convolutional neural network, Inf. Fusion
- Infrared and visible image fusion with ResNet and zero-phase component analysis, Infrared Phys. Technol.
- FusionGAN: A generative adversarial network for infrared and visible image fusion, Inf. Fusion
- Infrared and visible image fusion via detail preserving adversarial learning, Inf. Fusion
- A general framework for multiresolution image fusion: from pixels to regions, Inf. Fusion
- Sparse representation based multi-sensor image fusion for multi-focus and multi-modality images: A review, Inf. Fusion
- Fusion of visible and infrared images using global entropy and gradient constrained regularization, Infrared Phys. Technol.
- A new image quality metric for image fusion: the sum of the correlations of differences, AEU-Int. J. Electron. Commun.
- Remote sensing image fusion based on two-stream fusion network, Inf. Fusion
- From multi-scale decomposition to non-multi-scale decomposition methods: a comprehensive survey of image fusion techniques and its applications, IEEE Access
- Fusing concurrent visible and infrared videos for improved tracking performance, Opt. Eng.
- Infrared and visible image fusion based on saliency detection and infrared target segment
- Fusion of thermal infrared and visible spectrum video for robust surveillance