Retrieval of cloud top properties from advanced geostationary satellite imager measurements based on machine learning algorithms

https://doi.org/10.1016/j.rse.2019.111616

Highlights

  • A novel machine learning algorithm to retrieve cloud top height using Himawari-8

  • Significant improvements in cloud top height product from machine learning algorithm

  • A joint algorithm further reduces uncertainty in cloud top height.

Abstract

The cloud-top height (CTH) product derived from passive satellite instrument measurements is often used to build climate data records (CDR). CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) provides CTH with high accuracy but limited spatial and temporal resolution. The Advanced Himawari Imager (AHI) onboard the Japanese Himawari-8/-9 satellites provides high temporal (every 10 min) and high spatial (2 km at nadir) resolution measurements in 16 spectral bands. This paper reports on a study that derives CTH from combined AHI and CALIPSO data using advanced machine learning (ML) algorithms, with better accuracy than the traditional physical (TRA) algorithm. Validated against CALIPSO data, four different ML algorithms give a mean absolute error (MAE) of 1.54–2.72 km, compared with about 3.24 km for the TRA method; the improvement is largest for high and optically thin clouds. We also develop a joint algorithm that combines the optimal ML algorithm with the TRA algorithm to further reduce the MAE to 1.53 km and improve the layered accuracy (CTH < 18 km). Although the ML-based algorithm improves CTH retrieval over the TRA algorithm, lower and higher clouds still exhibit relatively large uncertainty. Combining both methods provides better CTH than either alone. The combined approach could be used to process data from advanced geostationary imagers for climate and weather applications.

Introduction

Clouds play an essential role in the global weather and climate systems, and their properties influence the radiation budget at the top of the atmosphere and at the surface (Baker, 1997; Sassen et al., 2007). The cloud top height (CTH, or cloud top pressure, CTP) is of particular importance for determining longwave radiation at the surface and for aviation safety (Holz et al., 2008). In general, CTH can be derived from passive satellite multichannel imaging measurements, often using infrared (IR) window (IRW), CO2-slicing, and one-dimensional variational (1DVAR) methods (Heidinger and Pavolonis, 2009; Li et al., 2001; Menzel et al., 2008) based on the physical properties of clouds. Some conventional spaceborne imagers, such as AVHRR (Advanced Very High Resolution Radiometer), HIRS (High-resolution Infrared Radiation Sounder), MODIS (Moderate Resolution Imaging Spectroradiometer), and VIIRS (Visible Infrared Imaging Radiometer Suite), have produced CTH climate data records (CDR) (Baum et al., 2012b) that help us further understand the Earth's climate system. Because most CTH retrieval methods for passive sensors involve a radiative transfer model (RTM), and an RTM in cloudy skies usually has large uncertainties (Li et al., 2017; Li et al., 2013), the retrieved CTHs have limited accuracy for optically thin or broken clouds (Baum et al., 2012a). Compared with passive sensors, the spaceborne lidar CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) onboard the CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) satellite uses laser return signals to retrieve CTH with higher accuracy, but with limited spatial coverage (nadir only) and limited temporal resolution (Li et al., 2018; Liu et al., 2019a; Winker et al., 2010). The CTHs derived from CALIPSO measurements are treated as truth to validate the corresponding CTH products from passive sensors (Holz et al., 2008).
Some previous studies have also pointed out significant biases or underestimations in the CTH products from passive sensor measurements (Baum et al., 2012a; Holz et al., 2008; Weisz et al., 2007). These biases or underestimations are mainly attributed to CALIOP's high sensitivity to high and optically thin cirrus (Holz et al., 2008), which passive sensors most likely miss. In addition, there is an intrinsic difference between CTHs from lidar and IR measurements: lidar captures the very top of the cloud, while IR measurements sense cloud radiation from an optical depth of one.

In recent years, the Advanced Himawari Imager (AHI) (Husi et al., 2019) onboard Himawari-8/-9 (H8/9), the Advanced Baseline Imager (ABI) (Schmit et al., 2005; Schmit et al., 2009) onboard the new generation of the Geostationary Operational Environmental Satellite (GOES)-R series, and the Advanced Geostationary Radiation Imager (AGRI) onboard Fengyun-4A (FY-4A) (Yang et al., 2017) have been successfully launched into geostationary (GEO) orbit. These imagers provide high temporal (full disk coverage every 10 to 15 min, and more frequent regional coverage) and high spatial (0.5–2 km at nadir for AHI and ABI, 0.5–4 km for AGRI) resolution measurements in 16 (14 for AGRI) spectral bands (Min et al., 2017). Combining spatially and temporally collocated CALIPSO and GEO imager (i.e. H8-9/AHI) measurements therefore offers a good opportunity to establish CTH retrievals with both high spatial and high temporal resolution.
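The collocation idea described above can be illustrated with a minimal sketch. It is not the authors' matching procedure: the distance and time thresholds (`max_km`, `max_minutes`), the record layout, and all sample values are hypothetical and chosen only for illustration. The sketch pairs each lidar footprint with the nearest imager pixel that lies within both thresholds.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def collocate(lidar_obs, imager_pixels, max_km=2.0, max_minutes=5.0):
    """Pair each lidar footprint with the nearest imager pixel that lies
    within both the distance and the time threshold (hypothetical values).

    lidar_obs:     list of dicts with 'lat', 'lon', 'minute', 'cth_km'
    imager_pixels: list of dicts with 'lat', 'lon', 'minute', 'bt_k'
    Returns a list of (lidar_record, imager_record, distance_km) tuples.
    """
    pairs = []
    for obs in lidar_obs:
        best, best_d = None, None
        for pix in imager_pixels:
            # Skip pixels observed too long before or after the lidar overpass.
            if abs(pix["minute"] - obs["minute"]) > max_minutes:
                continue
            d = great_circle_km(obs["lat"], obs["lon"], pix["lat"], pix["lon"])
            if d <= max_km and (best_d is None or d < best_d):
                best, best_d = pix, d
        if best is not None:
            pairs.append((obs, best, best_d))
    return pairs


# Synthetic records, for illustration only.
lidar = [
    {"lat": 10.00, "lon": 140.70, "minute": 3.0, "cth_km": 12.4},
    {"lat": 30.00, "lon": 150.00, "minute": 4.0, "cth_km": 2.1},
]
pixels = [
    {"lat": 10.01, "lon": 140.70, "minute": 0.0, "bt_k": 215.0},
    {"lat": 10.00, "lon": 140.71, "minute": 20.0, "bt_k": 216.0},  # outside the time window
    {"lat": 31.00, "lon": 150.00, "minute": 0.0, "bt_k": 280.0},   # too far away
]
matched = collocate(lidar, pixels)
```

The brute-force double loop keeps the sketch short; a real full-disk dataset would need a spatial index or a direct mapping from latitude and longitude to imager line and column.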

In addition, advanced machine learning (ML) techniques, such as K-nearest-neighbor (KNN), random forests (RF), support vector machines (SVM), artificial neural networks (ANN), and deep learning (DL, one kind of complex ANN algorithm), offer a possible solution to some non-linear problems in remote sensing and geoscience (Kühnlein et al., 2014a; Kühnlein et al., 2014b; Min et al., 2019). A previous study (Håkansson et al., 2018) used a neural network algorithm to retrieve CTP and CTH for several passive sensors in polar orbit. Here, advanced ML techniques (details in Appendix B) are used to build a connection between CALIPSO and GEO imager CTH determinations.

With 16 spectral bands (of which 10 are infrared) viewing the same cloud system, H8/AHI can depict cloud top properties from IR measurements during both day and night. The primary goal of this investigation is to derive CTHs from H8/AHI measurements day and night while avoiding the use of a radiative transfer model (RTM) in clear or cloudy skies. A spatially and temporally collocated AHI, CALIPSO, and numerical weather prediction (NWP) model dataset is used to train a statistical model based on ML methods. The statistical (prediction) model is then applied to the H8/AHI IR band measurements to derive CTH products with high temporal and spatial resolution over the full Earth disk. Independent validation tests are conducted to compare the CTH results from the traditional 1DVAR and ML-based methods. This study also addresses the following questions related to the new ML-based CTH retrieval approach: (1) How can CTH be derived from combined H8/AHI radiances and CALIOP measurements? (2) Does the ML-based algorithm improve CTH over other algorithms such as 1DVAR?
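The train-then-apply workflow described above can be sketched with a K-nearest-neighbor regressor, one of the ML techniques named in the text. This is a minimal illustration only and not the model developed in the study: the two predictors (an 11 μm brightness temperature and a split-window brightness temperature difference), the training values, and the choice of `k` are all hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the targets of the k training
    samples closest to `query` (Euclidean distance in feature space)."""
    ranked = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic "collocated" training set, for illustration only.
# Features: (11 um brightness temperature in K, 11-12 um difference in K)
# Target:   lidar cloud top height in km
train_x = [
    (210.0, 1.0), (215.0, 1.5), (220.0, 2.0),   # cold, high clouds
    (270.0, 0.5), (275.0, 0.4), (280.0, 0.3),   # warm, low clouds
]
train_y = [14.0, 13.0, 12.0, 3.0, 2.5, 2.0]

# Apply the trained relation to new imager pixels.
high_cloud_cth = knn_predict(train_x, train_y, (214.0, 1.2))
low_cloud_cth = knn_predict(train_x, train_y, (276.0, 0.4))
```

In this toy setup the cold pixel is assigned the mean height of the three cold training samples and the warm pixel the mean of the three warm ones. With real predictors of different units and ranges, the features would normally be standardized before computing distances.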

Section 2 briefly introduces the H8/AHI, CALIPSO, and global forecast system (GFS) NWP data. Section 3 describes the traditional physical (TRA) cloud-top pressure (CTP, which can be converted to CTH) algorithm along with validations for H8/AHI data. Section 4 introduces four classical ML algorithms, then selects and evaluates the optimal statistical model for CTH retrieval. Section 5 shows the CTH results from the new ML-based algorithm and a joint algorithm, and discusses the factors that may affect the ML-based method. Finally, Section 6 provides a summary. In addition, two appendices at the end of this study give further details on the TRA and four ML algorithms.

Section snippets

Data

The new-generation Japanese geostationary meteorological satellite, Himawari-8, was successfully launched into geosynchronous orbit on October 7, 2014, carrying the new AHI. Located at 140.7° E, AHI provides 16 bands of full disk earth-viewing imagery in visible (VIS, 4 bands), near-infrared (NIR, 2 bands), and infrared (IR, 10 bands) bands (central wavelengths from 0.47–13.4 μm) every 10 min with 0.5 (VIS, 1 band), 1.0 (VIS, 2 bands), and 2.0 (NIR/IR, 13 bands) km horizontal resolutions (

Algorithm description

The classical CO2-slicing algorithm (Menzel et al., 2008) for Aqua/Terra MODIS Collection 6 is used to retrieve operational cloud top properties. It uses several 13 and 14 μm spectral bands for ice cloud retrieval, and the IR-window approach (IRW) based on the 11 μm band for water cloud retrieval, along with a latitude-dependent lapse rate for low clouds over ocean. However, it is not possible to apply or adapt this algorithm to the current or new-generation GEO satellite imager; some important 13

Four ML algorithms

In this investigation, we primarily use four classical ML algorithms to train a cloud top properties prediction model: K-Nearest-Neighbor (KNN) (Altman, 1992; Coomans and Massart, 1982), Support Vector Machine (SVM) (Cao, 2003; Drucker et al., 1997), Random Forest (RF) (Breiman, 2001), and Gradient Boosting Decision Tree (GBDT) (Friedman, 2002). Compared with the H8/AHI CTH algorithms mentioned in Section 3 (Heidinger and Pavolonis, 2009), an ML-based prediction retrieval algorithm

Results and discussions

After the optimal prediction model is determined, we develop a CTH retrieval program for H8/AHI data in line with the procedure in Fig. 4. Fig. 7 shows the validations of CTH of H8/AHI from TRA and ML (based on the optimal GBDT model with first guess) algorithms using CALIPSO data for the testing dataset (mentioned in Fig. 3). The sub-figures at the last column in Fig. 7 show the layered MAE, MBE, and STD of CTH for three different datasets at an interval of 1 km for TRA and ML algorithms. From
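The layered statistics mentioned above (MAE, MBE, and STD in 1 km height intervals) can be computed along the following lines. This is a minimal sketch under stated assumptions rather than the authors' evaluation code: samples are binned by the reference (lidar) height, the error is defined as retrieved minus reference, MBE is taken to be the mean bias error, and STD is the population standard deviation of the errors. The sample heights are synthetic.

```python
import math
from collections import defaultdict


def layered_errors(retrieved_km, reference_km, bin_km=1.0):
    """Group retrieval errors by reference-height layer.

    Returns {layer_bottom_km: (MAE, MBE, STD, count)} for each non-empty
    layer, where error = retrieved - reference and STD is the population
    standard deviation of the errors in that layer.
    """
    layers = defaultdict(list)
    for ret, ref in zip(retrieved_km, reference_km):
        layer = math.floor(ref / bin_km) * bin_km
        layers[layer].append(ret - ref)

    stats = {}
    for layer, errs in sorted(layers.items()):
        n = len(errs)
        mbe = sum(errs) / n
        mae = sum(abs(e) for e in errs) / n
        std = math.sqrt(sum((e - mbe) ** 2 for e in errs) / n)
        stats[layer] = (mae, mbe, std, n)
    return stats


# Synthetic heights in km, for illustration only.
reference = [1.2, 1.8, 10.5, 10.9]
retrieved = [1.0, 2.4, 9.5, 9.9]
stats = layered_errors(retrieved, reference)
```

Here the 1–2 km layer has errors of opposite sign, so its MBE is smaller than its MAE, while the 10–11 km layer shows a uniform 1 km underestimate with near-zero spread. Reporting all three statistics per layer separates systematic bias from scatter.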

Summary

The objective of this study is to investigate a new approach for improving CTH estimation through the combined use of passive and active remote sensing measurements. The AHI radiance measurements from the first satellite of the new generation of the Japanese GEO series and the CALIPSO official cloud products (Version 4.1) are collocated spatially and temporally to develop statistical CTH retrieval methods based on advanced machine learning techniques, and retrieved products are validated with

Author contributions

Min Min: Conceptualization, Methodology, Software, Investigation, Resources, Validation, Writing-Original draft preparation. Jun Li: Conceptualization, Methodology, Supervision, Software, Investigation, Data curation, Writing-Original draft preparation. Fu Wang: Software, Visualization, Investigation. Zijing Liu: Validation, Software. W. Paul Menzel: Writing-Reviewing and Editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to acknowledge NASA, JMA, and NOAA for freely providing the MODIS (https://ladsweb.modaps.eosdis.nasa.gov/search), CALIPSO (https://subset.larc.nasa.gov/calipso/login.php), Himawari-8 (ftp.ptree.jaxa.jp), and GFS NWP (ftp://nomads.ncdc.noaa.gov/GFS/Grid4) data online. Special thanks go to the GOES-R Algorithm Working Group for guiding the TRA algorithm applications. The authors also sincerely appreciate the powerful computing tools developed by the Python and scikit-learn

References (54)

  • B.A. Baum et al. MODIS cloud top property refinements for Collection 6. J. Appl. Meteorol. Climatol. (2012)
  • L. Breiman. Random forests.
  • L. Breiman et al. Random Forests–Classification Manual.
  • D. Chen et al. The cloud top distribution and diurnal variation of clouds over East Asia: preliminary results from Advanced Himawari Imager. J. Geophys. Res. (2018)
  • H. Drucker et al. Support vector regression machines.
  • J.R. Eyre et al. Transmittance of atmospheric gases in the microwave region: a fast model. Appl. Opt. (1988)
  • J.H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal. (2002)
  • N. Håkansson et al. Neural network cloud top pressure and height for MODIS. Atmos. Meas. Tech. (2018)
  • A. Heidinger. ABI cloud height.
  • A. Heidinger et al. Gazing at cirrus clouds for 25 years through a split window, part 1: methodology. J. Appl. Meteorol. Climatol. (2009)
  • A.K. Heidinger et al. Using CALIPSO to explore the sensitivity to cirrus height in the infrared observations from NPOESS/VIIRS and GOES-R/ABI. J. Geophys. Res. (2010)
  • A.K. Heidinger et al. A naive Bayesian cloud-detection scheme derived from CALIPSO and applied within PATMOS-x. J. Appl. Meteorol. Climatol. (2012)
  • R.E. Holz et al. Global Moderate resolution Imaging Spectroradiometer (MODIS) cloud detection and height evaluation using CALIOP. J. Geophys. Res. (2008)
  • L. Husi et al. Ice cloud properties from Himawari-8/AHI next-generation geostationary satellite: capability of the AHI to monitor the DC cloud generation process. IEEE Trans. Geosci. Remote Sens. (2019)
  • E. Kalnay et al. The NCEP/NCAR 40-year reanalysis project. Bull. Am. Meteorol. Soc. (1996)
  • M. Kanamitsu. Description of the NMC global data assimilation and forecast system. Weather Forecast. (1989)
  • M.-H. Kim et al. The CALIPSO version 4 automated aerosol classification and lidar ratio selection algorithm. Atmos. Meas. Tech. (2018)