FW-GAN: Underwater image enhancement using generative adversarial network with multi-scale fusion

https://doi.org/10.1016/j.image.2022.116855

Highlights

  • We propose a multi-scale fusion generator network architecture. Based on an analysis of the underwater environment, an adaptive fusion strategy is introduced to fuse multi-source and multi-scale features, which effectively corrects color casts and haze, improves contrast, avoids blind enhancement of the image, and improves the generalization capability of the model.

  • We propose a decoder combined with channel attention that computes attention over the prior and decoded feature maps during fusion and adjusts them adaptively. The aim is to learn the potential associations between the fused prior features and the enhanced results.

  • We conduct qualitative and quantitative evaluations and compare FW-GAN with traditional methods and state-of-the-art models. The results show that FW-GAN has good generalization capability and competitive performance. Finally, an ablation study demonstrates the contribution of each core component of our network.

Abstract

Underwater robots have broad applications in fields such as ocean exploration, ocean pastures and environmental monitoring. However, due to the interference of light scattering and absorption, selective color attenuation, suspended particles and other complex factors in the underwater environment, it is difficult for robot vision sensors to obtain high-quality underwater images, which is a bottleneck that restricts the visual perception of underwater robots. In this paper, we propose a multi-scale fusion generative adversarial network named Fusion Water-GAN (FW-GAN) to enhance underwater image quality. The proposed model has four convolution branches that refine the features of the three prior inputs and encode the original input; the prior features are then fused through the proposed multi-scale fusion connections, and a channel attention decoder generates the enhanced result. We conduct qualitative and quantitative comparison experiments on real-world and synthetic distorted underwater image datasets under various degradation conditions. The results show that, compared with recent state-of-the-art underwater image enhancement methods, the proposed method achieves higher quantitative metric scores and better generalization capability. In addition, an ablation study demonstrates the contribution of each component.

Introduction

Since the beginning of the 21st century, with the rapid development of science and technology, mankind has accelerated the exploration and resource development of the marine environment. All kinds of underwater robots are widely used in military and civilian tasks in high-risk and highly polluted waters, such as underwater archaeology and topographic mapping, marine environment observation, marine resource investigation, maritime security and defense, and marine pasture. Visual perception is one of the most important means for underwater robots to capture environmental information, and it plays an important role in improving the autonomous navigation and intelligent operation of robots [1]. However, underwater environments tend to be harsh, and it is difficult for a robot’s vision sensors to obtain high-quality underwater images, mainly because image quality is seriously affected by the complex physical and chemical conditions of the underwater environment. For example, the selective attenuation of light usually gives underwater images a bluish or greenish tone, and particles suspended in the water absorb and scatter most of the light before the reflected light reaches the camera, resulting in low contrast, blur and haze. Such nonlinear distortions of underwater images severely restrict the performance of high-level vision tasks such as image segmentation [2] and object classification and detection [3]. Therefore, studying high-quality underwater image enhancement methods that overcome the deficiency in visual perception caused by image distortion, and thus give underwater robots better sight, is a key scientific problem that urgently needs to be solved [4], [5].
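
For context, the degradation described above is often summarized in the restoration literature by a simplified underwater image formation model; the equation below is standard background, not a formulation taken from this paper:

I_c(x) = J_c(x) \, t_c(x) + B_c \, (1 - t_c(x)), \qquad c \in \{r, g, b\}

where I_c is the observed image, J_c the clear scene radiance, B_c the background (veiling) light, and t_c(x) = e^{-\beta_c d(x)} the wavelength-dependent transmission governed by the attenuation coefficient \beta_c and the scene depth d(x). Because red light attenuates fastest in most waters, the red channel decays rapidly with distance, which accounts for the bluish or greenish casts mentioned above.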

Existing methods for improving underwater image quality mainly fall into three categories: model-based image restoration, non-model-based image enhancement, and deep learning-based image enhancement [1]. Since physical and chemical conditions differ greatly across waters, physical model-based restoration methods need to obtain empirical parameters of the water conditions in advance, which limits their generalization. Traditional enhancement methods can achieve attractive visual effects by adjusting the distribution of image pixel values; however, because they lack an underwater imaging model, some scenes are over- or under-enhanced and noise and artifacts are introduced.

In recent years, the theory and technology of deep learning have developed rapidly. Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) [6] have made encouraging progress in many visual computing tasks, especially image super-resolution [7], [8], deblurring [9] and deraining [10]. Meanwhile, deep learning-based underwater image enhancement has also made good progress [4], [11], [12], but there is still much room for improvement, especially in performance and generalization capability. To address these problems, this paper proposes a multi-scale fusion generative adversarial network named Fusion Water-GAN (FW-GAN) to learn the nonlinear mapping between distorted underwater images and high-quality underwater images. The main contributions of this paper are summarized as follows:

  • 1.

    We propose a multi-scale fusion generator network architecture. Based on an analysis of the underwater environment, an adaptive fusion strategy is introduced to fuse multi-source and multi-scale features, which effectively corrects color casts and haze, improves contrast, avoids blind enhancement of the image, and improves the generalization capability of the model.

  • 2.

    We propose a decoder combined with channel attention that computes attention over the prior and decoded feature maps during fusion and adjusts them adaptively, with the aim of learning the potential associations between the fused prior features and the enhanced results (a minimal illustrative sketch of this idea follows this list).

  • 3.

    We conduct qualitative and quantitative evaluations and compare FW-GAN with traditional methods and state-of-the-art deep learning-based models. The results show that FW-GAN effectively improves the quality of underwater images and has good generalization capability and competitive performance. We also carry out application tests showing that the proposed method benefits other vision tasks. Finally, we conduct an ablation study to demonstrate the contribution of each core component of our network.
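
As a concrete illustration of contribution 2, the following is a minimal sketch, under our own assumptions about layer choices, of how a channel-attention decoder block could re-weight concatenated prior and decoded feature maps before fusing them; it is illustrative only, not the authors' exact implementation.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (assumed design)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # global average pooling ("squeeze")
        self.fc = nn.Sequential(                 # per-channel gating ("excitation")
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(self.pool(x))         # re-weight each channel

class AttentionDecoderBlock(nn.Module):
    """Fuses prior features with decoded features under channel attention."""
    def __init__(self, prior_ch: int, dec_ch: int, out_ch: int):
        super().__init__()
        self.attn = ChannelAttention(prior_ch + dec_ch)
        self.up = nn.ConvTranspose2d(prior_ch + dec_ch, out_ch,
                                     kernel_size=4, stride=2, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, prior_feat: torch.Tensor, dec_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([prior_feat, dec_feat], dim=1)  # multi-source fusion
        fused = self.attn(fused)                          # adaptive channel adjustment
        return self.act(self.up(fused))                   # upsample toward output resolution

For example, with hypothetical 256-channel prior and decoded maps at 32x32, the block above would produce an out_ch-channel map at 64x64.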

Related works

In the past few years, many restoration and enhancement methods have been proposed to improve the quality of underwater images. The existing methods can be divided into the following three categories.

Network architecture

In the proposed FW-GAN network, the generator adopts an encoder–decoder structure based on U-Net. The difference is that we use fusion connections instead of skip connections between encoding and decoding layers of the corresponding size. To better fuse the various prior features into the decoding process, we designed a decoder combined with channel attention. The fusion connections and channel attention decoder are described in the following subsections.
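
As a rough sketch of the fusion-connection idea (one plausible realization under our own assumptions, not the paper's exact layers), the features from the prior branches can be resized to the encoder's resolution at each scale and merged with a convolution, replacing the plain skip connection:

import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionConnection(nn.Module):
    """Merges encoder features with multi-source prior features at one scale."""
    def __init__(self, enc_ch: int, prior_chs: tuple, out_ch: int):
        super().__init__()
        total = enc_ch + sum(prior_chs)
        self.merge = nn.Sequential(
            nn.Conv2d(total, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, enc_feat, prior_feats):
        h, w = enc_feat.shape[-2:]
        # Bring every prior branch to the encoder's spatial resolution.
        aligned = [F.interpolate(p, size=(h, w), mode="bilinear",
                                 align_corners=False) for p in prior_feats]
        return self.merge(torch.cat([enc_feat, *aligned], dim=1))

One such connection would sit at each resolution level of the U-Net-style generator, feeding the channel-attention decoder blocks described above.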

Experiment

In this section, we first describe the implementation details and then introduce the experimental settings. To evaluate our method, we compare it with traditional and state-of-the-art methods on both synthetic and real-world underwater datasets. Finally, we conduct ablation experiments to verify each core component of FW-GAN.
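
For paired synthetic data, full-reference metrics such as PSNR and SSIM are commonly reported for this task; the snippet does not specify the exact metrics used in the paper, so the following scikit-image sketch is purely illustrative.

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(enhanced: np.ndarray, reference: np.ndarray) -> dict:
    """Both inputs are HxWx3 uint8 images of the same size (hypothetical helper)."""
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
    ssim = structural_similarity(reference, enhanced,
                                 channel_axis=-1, data_range=255)
    return {"psnr": psnr, "ssim": ssim}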

Conclusion

This paper presents an underwater image enhancement model based on a GAN. The model has a generator with multiple inputs and convolution branches. Three prior images are input to refine the prior features, and these features are then fused into the encoding and decoding process of the original input image through the multi-scale fusion strategy to obtain the enhanced underwater image. In addition, we designed a channel attention decoder to learn the correlation and dependence between the prior features and the enhanced results.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the Key Area Research Projects of Universities of Guangdong Province under Grant 2019KZDZX1026, in part by the Natural Science Foundation of Guangdong Province under Grant 2020A1515110255, in part by the Science and Technology Program of Guangzhou under Grant 202102080591, in part by the Natural Science Foundation of China under Grant 61603103, and in part by the Innovation Team Project of Universities of Guangdong Province under Grant 2020KCXTD015.

References (61)

  • H. Li et al., DewaterNet: A fusion adversarial real underwater image enhancement network, Signal Process., Image Commun. (2021)
  • J. Zhou et al., Underwater image restoration via backscatter pixel prior and color compensation, Eng. Appl. Artif. Intell. (2022)
  • Z. Fu et al., Twice mixing: A rank learning based quality assessment approach for underwater image enhancement, Signal Process., Image Commun. (2022)
  • Y. Wang et al., An experimental-based review of image enhancement and image restoration methods for underwater imaging, IEEE Access (2019)
  • M. O’Byrne et al., Semantic segmentation of underwater imagery using deep networks trained on synthetic imagery, J. Mar. Sci. Eng. (2018)
  • W.-H. Lin et al., RoIMix: Proposal-fusion among multiple images for underwater object detection
  • C. Fabbri et al., Enhancing underwater imagery using generative adversarial networks
  • M.J. Islam et al., Fast underwater image enhancement for improved visual perception, IEEE Robot. Autom. Lett. (2020)
  • I. Goodfellow et al., Generative adversarial nets, Adv. Neural Inf. Process. Syst. (2014)
  • Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, Y. Fu, Image super-resolution using very deep residual channel attention...
  • W.-S. Lai et al., Fast and accurate image super-resolution with deep Laplacian pyramid networks, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
  • H. Zhang et al., Deep stacked hierarchical multi-patch network for image deblurring
  • S.W. Zamir et al., Multi-stage progressive image restoration
  • K. He et al., Single image haze removal using dark channel prior, IEEE Trans. Pattern Anal. Mach. Intell. (2010)
  • P. Drews, E. Nascimento, F. Moraes, S. Botelho, M. Campos, Transmission estimation in underwater single images, in:...
  • P.L. Drews et al., Underwater depth estimation and image restoration based on single images, IEEE Comput. Graph. Appl. (2016)
  • N. Carlevaris-Bianco et al., Initial results in underwater single image dehazing
  • W. Song et al., A rapid scene depth estimation model based on underwater light attenuation prior for underwater image restoration
  • D. Akkaynak et al., Sea-thru: A method for removing water from underwater images
  • W. Song et al., Enhancement of underwater images with statistical model of background light and optimization of transmission map, IEEE Trans. Broadcast. (2020)