Pyramid Channel-based Feature Attention Network for image dehazing

https://doi.org/10.1016/j.cviu.2020.103003

Highlights

  • We propose an end-to-end Pyramid Channel-based Feature Attention Network for single image dehazing, which does not need to explicitly estimate the transmission map and the atmospheric light.

  • The PCFA module can extract more informative features by the channel attention block, and fuse the complementary features in different levels in a pyramid manner.

  • A loss function that combines a mean square error loss part and an edge loss part is employed in PCFAN, which can better preserve image details.

  • Extensive experiments demonstrate that the proposed PCFAN performs favorably compared with state-of-the-art methods, in terms of quantitative accuracy and qualitative visual effect.

Abstract

Traditional deep learning-based image dehazing methods usually use high-level features (which contain more semantic information) to remove haze from the input image, while ignoring low-level features (which contain more detail information). In this paper, a Pyramid Channel-based Feature Attention Network (PCFAN) is proposed for single image dehazing, which leverages the complementarity among features at different levels in a pyramid manner with a channel attention mechanism. PCFAN consists of three modules: a three-scale feature extraction module, a pyramid channel-based feature attention (PCFA) module, and an image reconstruction module. The three-scale feature extraction module simultaneously captures low-level spatial structural features and high-level contextual features at different scales. The PCFA module combines the feature pyramid with the channel attention mechanism: it extracts interdependent channel maps and selectively aggregates the more important features in a pyramid manner for image dehazing. The image reconstruction module reconstructs a clear image from these features. In addition, PCFAN is trained with a loss function that combines a mean square error loss term and an edge loss term, which better preserves image details. Experimental results demonstrate that the proposed PCFAN outperforms existing state-of-the-art algorithms on standard benchmark datasets in terms of accuracy, efficiency, and visual effect. The code will be made publicly available.

Introduction

Image dehazing, which aims to recover a clear image from a given hazy input, is a classical image processing problem. It has attracted significant attention in image processing and computer vision in recent decades, because dehazing techniques are required in many higher-level vision tasks (Zhang et al., 2020b, Yuan et al., 2017, Liu et al., 2018, Zhang et al., 2015).

Most successful methods depend on the atmosphere scattering model (Narasimhan and Nayar, 2002), which provides an estimate of the haze-free image. It is formulated as $I(x) = t(x)J(x) + A(x)(1 - t(x))$, where $x$ refers to the pixel coordinates in the image plane, $I$ denotes the observed image that is degraded by haze, and $J$ is the haze-free scene image. The matrix $A$ represents the global atmospheric light, and the transmission map $t$ is the medium transmission rate, which describes the portion of the light that reaches the camera sensors without being scattered. The transmission map can be expressed as $t(x) = e^{-\beta d(x)}$, where $\beta$ is the scattering coefficient of the atmosphere and $d(x)$ is the scene depth. However, the transmission map and the atmospheric light are unknown in practice. Therefore, many image dehazing methods estimate $t$ and $A$ from a hazy image $I$, and then obtain the unknown clear image $J$ via the atmosphere scattering model.
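To make the model concrete, the following is a minimal NumPy sketch of the atmosphere scattering model: it synthesizes a hazy image from a clear image and a depth map, then inverts the model when the transmission map and atmospheric light are known. The function names, the random test data, and the lower bound `t_min` on the transmission (a common safeguard against division by very small values) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def transmission(depth, beta=1.0):
    """Transmission map t(x) = exp(-beta * d(x)) for a depth map of shape (H, W)."""
    return np.exp(-beta * depth)

def add_haze(J, t, A):
    """Forward model: I(x) = t(x) J(x) + A (1 - t(x)), with J of shape (H, W, 3)."""
    t = t[..., None]  # broadcast the single-channel map over the colour channels
    return t * J + A * (1.0 - t)

def remove_haze(I, t, A, t_min=0.1):
    """Invert the model for J when t and A are known.

    t_min is an assumed clamp that avoids amplifying noise where t is tiny.
    """
    t = np.maximum(t, t_min)[..., None]
    return (I - A * (1.0 - t)) / t

# Illustrative data only: a random "clear" image and a random depth map.
rng = np.random.default_rng(0)
J = rng.uniform(0.0, 1.0, size=(4, 4, 3))
depth = rng.uniform(0.1, 2.0, size=(4, 4))
A = np.array([0.9, 0.9, 0.9])  # assumed global atmospheric light

t = transmission(depth, beta=1.0)
I = add_haze(J, t, A)
J_rec = remove_haze(I, t, A)
```

With exact `t` and `A` the inversion recovers `J` up to floating-point error. In practice both quantities must be estimated from `I`, which is where estimation errors enter.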

Previous image dehazing approaches concentrate on restoring the clear image using priors such as dark-channel prior, contrast color-lines, and haze-line prior. For example, He et al. (2010) propose a dark channel prior (DCP) based method for estimating the transmission map. Although these prior-based methods have achieved considerable success, their performance is limited because not all images of real scenes are compatible with the predefined priors. Recently, deep learning has proven effective in various computer vision tasks, and various convolutional neural network (CNN) based methods have been proposed to estimate the transmission map and the atmospheric light. Once the transmission map and the atmospheric light are estimated, the dehazed image is restored through the atmosphere scattering model. Generally speaking, low-level features in a CNN partly capture detail information, while high-level features contain more semantic information. Both are important for recovering a clear image, but most CNN-based methods rely mainly on high-level features for image dehazing. Moreover, these methods are based on the atmosphere scattering model: if the estimated transmission map and atmospheric light are inaccurate, the dehazed result will be of low quality.
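As an illustration of the prior-based approach mentioned above, the dark channel prior can be sketched in a few lines of NumPy. This is a simplified sketch of the commonly cited formulation (minimum over colour channels, then minimum over a local patch, followed by the estimate t = 1 - omega * dark(I / A)), not code from this paper. The helper names, the small patch size, and the value omega = 0.95 are assumptions chosen for the example, and the atmospheric light A is taken as given rather than estimated.

```python
import numpy as np

def dark_channel(img, patch=3):
    """Dark channel of an (H, W, 3) image: per-pixel minimum over the colour
    channels, followed by a minimum filter over a patch x patch window.
    patch is assumed to be odd."""
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    H, W = min_rgb.shape
    out = np.full((H, W), np.inf)
    # Minimum filter written as a running minimum over shifted views.
    for dy in range(patch):
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + H, dx:dx + W])
    return out

def estimate_transmission(I, A, omega=0.95, patch=3):
    """Transmission estimate t = 1 - omega * dark_channel(I / A).

    omega < 1 keeps a small amount of haze for distant objects (assumed value).
    """
    return 1.0 - omega * dark_channel(I / A, patch)

# Illustrative inputs: a fully hazy image (every pixel equals A) and an image
# whose red channel is zero everywhere.
A = np.array([0.8, 0.9, 1.0])
hazy = np.ones((5, 5, 3)) * A
t_hazy = estimate_transmission(hazy, A)

dark_pixels = hazy.copy()
dark_pixels[..., 0] = 0.0
t_clear = estimate_transmission(dark_pixels, A)
```

The two inputs show the limits of the estimate: a pixel indistinguishable from the atmospheric light gets the lowest transmission, while any patch containing a zero-valued channel is treated as haze-free. Scenes that violate this assumption, such as large bright regions, are the cases where the prior fails.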

In this work, we propose a novel end-to-end framework called Pyramid Channel-based Feature Attention Network (PCFAN) for single image dehazing, which leverages the complementarity among features at different levels in a pyramid manner with a channel attention mechanism. Specifically, PCFAN consists of three modules: a three-scale feature extraction module, a pyramid channel-based feature attention (PCFA) module, and an image reconstruction module. First, the three-scale feature extraction module extracts features at three different scales. Then, these features are fed into the PCFA module, which extracts the more important attention features with channel-attention blocks and fuses these attention features across levels. Finally, based on the output of PCFA, the image reconstruction module restores a clear image. In addition, we introduce a training loss function that consists of two terms: the MSE loss and the Edge loss. The MSE loss measures the pixel-wise distance, while the Edge loss encourages the network to generate a clean image with more details. As shown in Fig. 1, the proposed PCFAN produces a more realistic image with more details.
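The two-term loss described above can be sketched as follows. The text here does not specify how edges are extracted or how the two terms are weighted, so this NumPy sketch makes its own assumptions: a 3x3 Laplacian kernel as the edge operator, a mean squared difference between edge maps as the edge term, and a hypothetical weight `edge_weight`. It illustrates the idea of combining a pixel-wise term with an edge term and should not be read as the loss used in PCFAN.

```python
import numpy as np

# Assumed edge operator: a 3x3 Laplacian kernel (not specified in the text).
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])

def edge_map(img):
    """Apply the Laplacian to each channel of an (H, W, C) image,
    using edge-replicated padding so the output keeps the input size."""
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    H, W = img.shape[:2]
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * padded[dy:dy + H, dx:dx + W, :]
    return out

def mse_loss(pred, target):
    """Pixel-wise mean squared error."""
    return float(np.mean((pred - target) ** 2))

def edge_loss(pred, target):
    """Mean squared error between the edge maps of prediction and target."""
    return float(np.mean((edge_map(pred) - edge_map(target)) ** 2))

def total_loss(pred, target, edge_weight=0.01):
    """MSE term plus weighted edge term; edge_weight is a hypothetical value."""
    return mse_loss(pred, target) + edge_weight * edge_loss(pred, target)

# Illustrative target with a vertical step edge, and a flat prediction.
target = np.zeros((6, 6, 3))
target[:, 3:, :] = 1.0
flat_pred = np.full((6, 6, 3), 0.5)
```

A prediction that is flat where the target has a sharp edge is penalized by the edge term, whereas a uniform brightness offset is penalized only by the MSE term, because the Laplacian of a constant is zero.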

The main features of the proposed image dehazing method are summarized as follows.

  • We propose an end-to-end Pyramid Channel-based Feature Attention Network for single image dehazing, which does not need to explicitly estimate the transmission map and the atmospheric light.

  • The PCFA module can extract more informative features by the channel attention block, and fuse the complementary features in different levels in a pyramid manner.

  • A loss function that combines a mean square error loss part and an edge loss part is employed in PCFAN, which can better preserve image details.

  • Extensive experiments on standard benchmark datasets demonstrate that the proposed PCFAN performs favorably compared with state-of-the-art methods, in terms of quantitative accuracy and qualitative visual effect.
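The channel attention block mentioned in the list above is not detailed in this excerpt. As a rough illustration of the general idea, the sketch below implements a squeeze-and-excitation style channel attention in NumPy: global average pooling per channel, a two-layer bottleneck, and a sigmoid gate that rescales each channel. The function names, the reduction ratio `r`, and the random weights are assumptions for the example, and the actual PCFA block may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feat, w_down, w_up):
    """Rescale each channel of a (C, H, W) feature map by a learned gate.

    w_down has shape (C // r, C) and w_up has shape (C, C // r); both would be
    learned in a real network and are supplied by the caller here.
    """
    squeezed = feat.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(w_down @ squeezed, 0.0)   # bottleneck with ReLU
    weights = sigmoid(w_up @ hidden)              # per-channel gate in [0, 1]
    return feat * weights[:, None, None], weights

# Illustrative shapes and random weights only.
rng = np.random.default_rng(0)
C, r = 8, 4
feat = rng.normal(size=(C, 5, 5))
w_down = rng.normal(size=(C // r, C))
w_up = rng.normal(size=(C, C // r))

out, weights = channel_attention(feat, w_down, w_up)
```

The gate has one value per channel, so the block changes how much each feature channel contributes without altering the spatial layout within a channel.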

The rest of this paper is structured as follows. A brief review of image dehazing and the attention mechanism is given in Section 2. The proposed PCFAN method is discussed in Section 3. The experimental results are presented in Section 4, and Section 5 concludes the paper.

Section snippets

Related work

In this section, we review related work on image dehazing and on the attention mechanism.

Image dehazing. Recent years have witnessed great advancements in the task of single image dehazing. Many classical methods have been proposed in the existing literature to tackle this well-known ill-posed problem (Zhao et al., 2019, Hodges et al., 2019, Alajarmeh et al., 2018, He et al., 2010, Ren et al., 2016). These methods can be generally classified as either image prior-based

Network architecture

In this work, we combine the benefits of the channel-attention and pyramid operation, and propose a pyramid channel-based feature attention network (PCFAN) for image dehazing. The overall framework of PCFAN is illustrated in Fig. 2. The PCFAN consists of three modules, namely the three-scale feature extraction module, the pyramid channel-based feature attention module, and the image reconstruction module. The three-scale feature module contains three stages: The first feature extraction stage

Experiments

In this section, extensive experiments are conducted on both a synthetic dataset and a real world dataset to demonstrate the effectiveness of the proposed network. The proposed network is compared with state-of-the-art image prior-based methods and learning-based methods, including DCP (He et al. CVPR’09), DehazeNet (Cai et al. TIP’16), MSCNN (Ren et al. ECCV’16), AOD-Net (Li et al. ICCV’17), GFN (Ren et al. CVPR’18), DCPDN (Zhang et al. CVPR’18), EPDN (Qu et al. CVPR’19) and FAMEDNet (Zhang

Conclusion

In this paper, we introduce a novel end-to-end dehazing network called pyramid channel-based feature attention network (PCFAN) to tackle the challenging single image dehazing problem. PCFAN consists of a three-scale extraction module, a pyramid channel-based feature attention module, and an image reconstruction module. PCFAN is able to efficiently restore the haze-free image directly. In addition, we propose a novel Edge loss to help the network learn more detailed information. The PCFAN is

CRediT authorship contribution statement

Xiaoqin Zhang: Funding acquisition, Project administration, Supervision, Conceptualization, Writing - review & editing. Tao Wang: Methodology, Investigation, Software, Writing - original draft. Jinxin Wang: Software, Data curation. Guiying Tang: Conceptualization, Resources, Visualization, Validation. Li Zhao: Supervision, Writing - review, Formal analysis.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China [grant no. 61922064], in part by the Zhejiang Provincial Natural Science Foundation, China [grant nos. LR17F030001, LQ19F020005], and in part by the Project of science and technology plans of Wenzhou City, China [grant nos. C20170008, G20150017, ZG2017016].

References (37)

  • Alajarmeh, A., et al., 2018. Real-time framework for image dehazing based on linear transmission and constant-time airlight estimation. Inform. Sci.
  • Hodges, C., et al., 2019. Single image dehazing using deep neural networks. Pattern Recognit. Lett.
  • Zhao, D., et al., 2019. Multi-scale optimal fusion model for single image dehazing. Signal Process., Image Commun.
  • Berman, D., Avidan, S., et al., 2016. Non-local image dehazing. In: Proceedings of IEEE Conference on Computer Vision...
  • Bluche, T., 2016. Joint line segmentation and transcription for end-to-end handwritten paragraph recognition. In:...
  • Cai, B., et al., 2016. Dehazenet: An end-to-end system for single image haze removal. IEEE Trans. Image Process.
  • Cao, C., Liu, X., Yang, Y., Yu, Y., Wang, J., Wang, Z., Huang, Y., Wang, L., Huang, C., Xu, W., et al., 2015. Look and...
  • Chen, C., et al. Robust image and video dehazing with visual artifact suppression via gradient residual minimization.
  • Chen, D., et al. Gated context aggregation network for image dehazing and deraining.
  • Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., Lu, H., 2019. Dual attention network for scene segmentation. In:...
  • Girshick, R., 2015. Fast r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision. pp....
  • He, K., et al., 2010. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell.
  • He, K., et al., 2015. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell.
  • He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of IEEE...
  • Itti, L., et al., 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell.
  • Jaderberg, M., Simonyan, K., Zisserman, A., et al., 2015. Spatial transformer networks. In: Proceedings of Advances in...
  • Kingma, D.P., et al., 2014. Adam: A method for stochastic optimization.
  • Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep convolutional neural networks. In:...

    Xiaoqin Zhang received the B.Sc. degree in electronic information science and technology from Central South University, China, in 2005 and the Ph.D. degree in pattern recognition and intelligent system from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China, in 2010. He is currently a professor at Wenzhou University, China. His research interests are in pattern recognition, computer vision and machine learning. He has published more than 80 papers in international and national journals and international conferences, including IEEE T-PAMI, IJCV, IEEE T-IP, IEEE T-IE, IEEE T-C, ICCV, CVPR, NIPS, IJCAI, and AAAI, among others.

    Tao Wang is currently a graduate student at College of Computer Science and Artificial Intelligence, Wenzhou University, China. He received the B.Sc. degree in information and computing science from Hainan Normal University, China, in 2018. His research interests include several topics in computer vision and machine learning, such as object tracking, image/video quality restoration, adversarial learning, image-to-image translation and reinforcement learning.

    Jinxin Wang is currently a graduate student at College of Computer Science and Artificial Intelligence, Wenzhou University, China. He received his bachelor’s degree in information and computing science at Wenzhou University. His research interests include visual tracking, image generation and deep learning.

    Guiying Tang is currently a graduate student at College of Mathematics and Physics, Wenzhou University, China. She received the B.Sc. degree from the College of Mathematics and Software Science, Sichuan Normal University, China, in 2017. Her main research interests are in computer vision and deep learning, such as image quality restoration and object tracking.

    Li Zhao received the B.Sc. degree in automation in 2005 and MEng degree in control theory and control engineering in 2008 from Central South University, China. She is currently an assistant researcher in Wenzhou University. Her research interests are in pattern recognition, computer vision, and machine learning.
