The estimation of hourly PM2.5 concentrations across China based on a Spatial and Temporal Weighted Continuous Deep Neural Network (STWC-DNN)

doi:10.1016/j.isprsjprs.2022.05.011

ISPRS Journal of Photogrammetry and Remote Sensing

Volume 190, August 2022, Pages 38-55

https://doi.org/10.1016/j.isprsjprs.2022.05.011

Abstract

The continuous distributions of PM_2.5 concentrations and predictor variables in the surrounding regions notably influence the PM_2.5 concentrations at the prediction positions, yet few machine learning models have quantified the spatially continuous interactions between PM_2.5 concentrations and predictor variables, which limits prediction accuracy. To fill this gap, a Spatial and Temporal Weighted Continuous Deep Neural Network (STWC-DNN) was proposed. In STWC-DNN, three sub-networks, the Single Pixel Network (SPN), the Multiple Station Network (MSN), and the Continuous Region Network (CRN), were designed to analyze the influence of predictor variables at the prediction position, the influence of PM_2.5 concentrations from surrounding stations, and the influence of continuous raster predictor variables from surrounding pixels, respectively. STWC-DNN was tested using hourly Himawari AOD data and its outputs were compared with those of a series of advanced models. STWC-DNN achieved higher accuracy than existing models, with sample-based, time-based, and station-based 10-fold cross-validation (CV) R² of 0.92, 0.90, and 0.79, respectively. The principle of establishing STWC-DNN sheds useful light on the effective use of raster predictor variables and automatic spatiotemporal weight functions to better estimate PM_2.5 and other airborne pollutants based on multiple data sources. The code of STWC-DNN is available at https://github.com/wangzh2022/STWC-DNN.

Introduction

With rapid urbanization and industrialization, PM_2.5 (particulate matter with an aerodynamic diameter ≤ 2.5 μm) has become a severe environmental issue in China, particularly in some urban agglomerations. PM_2.5 has a significant negative effect on public health, especially through increased cardiovascular and respiratory morbidity and mortality (Lim et al., 2011, Crouse et al., 2012, Kloog et al., 2013). Therefore, growing efforts have been made to comprehensively understand the spatiotemporal variations of PM_2.5 concentrations (Ye et al. 2018) and their anthropogenic (Hagler et al. 2007) and meteorological drivers (Chen et al., 2017, Chen et al., 2018, Chen et al., 2019a, Chen et al., 2019b). These studies usually require long time-series and large-scale PM_2.5 concentration data. Since 2013, with increasing haze episodes, ground PM_2.5 observations have been made at 1497 stations across the country. However, high-resolution and high-precision PM_2.5 concentration products, which are required in a wide range of applications, cannot be reliably obtained by interpolating observations from sparsely and unevenly distributed ground stations (Zhang et al. 2018).

Thanks to the rapid progress of satellite remote sensing technology, estimating PM_2.5 concentrations from satellite-derived aerosol optical depth (AOD) products has become a promising approach. A series of AOD products, mainly from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Fang et al., 2016, He and Huang, 2018, Hu et al., 2014), GF-1 (Zhang et al. 2018), the Visible Infrared Imaging Radiometer Suite (VIIRS) (Wu et al., 2016, Hu et al., 2017, Yao et al., 2019), and the Multiangle Imaging SpectroRadiometer (MISR) (Liu et al., 2007, Ma et al., 2014, You et al., 2015), have been employed in previous studies to estimate PM_2.5 concentrations. However, one major limitation of most AOD products is their relatively coarse temporal resolution: they are generally produced on a daily basis or at even longer intervals. Daily PM_2.5 products cannot effectively meet the requirement for exploring the effects of anthropogenic activities and meteorological conditions on PM_2.5 concentrations, which generally vary notably on an hourly basis (Chen et al. 2020).

As a geostationary satellite, Himawari-8 can provide AOD with a spatial resolution of 5 km and a temporal resolution of 10 min (Level 2) or 1 h (Level 3) (Bessho et al., 2016, Yumimoto et al., 2016). Based on the Himawari-8 Level 3 AOD products, a growing number of studies have attempted to estimate PM_2.5 concentrations using different algorithms. Wang et al. (2017) employed an improved linear mixed-effect (LME) model for the Beijing-Tianjin-Hebei (BTH) region and achieved a coefficient of determination (R²) of 0.86. Zhang et al. (2019) employed an improved seasonal LME model for Central China (CCH), BTH, the Yangtze River Delta (YRD) and the Pearl River Delta (PRD) and achieved R² of 0.82, 0.84, 0.80 and 0.74, respectively. Chen et al. (2019) employed a stacking model for Central and Eastern China and achieved an R² of 0.85. Wei et al. (2021) proposed a Space-Time Light Gradient Boosting Machine (STLG) model for China and achieved an R² of 0.85. Although these studies proved the feasibility of estimating hourly PM_2.5 products at regional scales, major limitations remained. LME models cannot effectively estimate PM_2.5 concentrations in areas with limited stations (Wang et al. 2017), and are thus not suitable for large-scale and continuous estimation. Since the stacking model ignored spatiotemporal variations, PM_2.5 products generated using this model suffered from abnormal values, similar to salt-and-pepper noise, which caused large prediction biases (Wu et al. 2016). The STLG model generated its spatial feature from the geographical distances between the prediction position and the stations, which would suffer from uneven station distribution. Therefore, a robust and comprehensive model is required for better estimating PM_2.5 concentrations at a national scale.

A key step in AOD-based PM_2.5 estimation is the exploration of the AOD-PM_2.5 relationship. In early studies, linear models (Engel-Cox et al., 2004, Gupta and Christopher, 2009) were a major approach for estimating PM_2.5 concentrations from AOD. However, the AOD-PM_2.5 relationship is complicated and non-linear under different emission and meteorological conditions (Yang et al. 2019). Therefore, advanced models were proposed to better explain the AOD-PM_2.5 relationship. Statistical models such as the Geographically Weighted Regression (GWR) model (Hu et al., 2014, Song et al., 2015), the LME model (Li et al., 2015), the two-stage model (Hu et al., 2014, Ma et al., 2016), and the Geographically and Temporally Weighted Regression (GTWR) model (Huang et al., 2010, He and Huang, 2018) have been employed to explain the nonlinear AOD-PM_2.5 relationship by adding random effects or local effects to regression models. Nevertheless, it remains challenging for statistical models to precisely express the complicated, uncertain nonlinear AOD-PM_2.5 relationship. Compared with statistical models, machine learning models can fit complicated relationships (Hu et al. 2017). Researchers have employed models such as random forest (RF) models (Hu et al. 2017) and neural network (NN) models (Wu et al., 2012, Li et al., 2017a, Wang et al., 2019) to predict PM_2.5 concentrations at a prediction pixel by considering various predictor variables, including AOD data and auxiliary variables such as meteorological and land use data, and achieved improved prediction accuracy.

Previous studies demonstrated the potential of applying machine learning to PM_2.5 concentration prediction. Nevertheless, spatiotemporal variations received limited consideration in these machine learning models. On one hand, PM_2.5 concentrations at surrounding stations were considered in the Spatial and Temporal Random Forest (STRF) model (Wei et al. 2019), the Geo-intelligent Deep Belief Network (Geoi-DBN) model (Li et al. 2017b), the Geographically and Temporally Weighted Neural Network (GTWNN) model (Li et al. 2020a), and the STLG model (Wei et al. 2021). For the STRF and Geoi-DBN models, the spatiotemporal variation terms, calculated through a spatiotemporal weight function, are used as extra input predictor variables. For GTWNN, the spatiotemporal variations are represented in a similar way to GTWR, yet the linear model in GTWR is replaced with a Neural Network (NN) model. However, for these models, the spatiotemporal weight function is fixed and does not adapt to the actual data sources. A fixed weight function may cause large uncertainties when summarizing remarkable spatiotemporal variations over large areas. The accuracy of PM_2.5 concentration prediction would be affected significantly in regions where spatiotemporal variations do not fit the weight function. Missing data and uneven station distribution would also result in low prediction accuracy. On the other hand, the diffusion of PM_2.5 is controlled by the surrounding spatial distribution of meteorological conditions (Chen et al. 2020). Even if the predictor variables at two positions are the same, different diffusion conditions in the surrounding regions would lead to different PM_2.5 concentrations at those positions. This situation cannot be modeled by most existing machine learning models, which simply consider predictor variables at the prediction position, resulting in limited prediction accuracy.
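To make the idea of a fixed spatiotemporal weight function concrete, the following minimal sketch computes a weighted average of surrounding station PM_2.5 values using inverse squared space-time distance. The functional form, the function name `fixed_weight_pm25`, and the scale factor `lam` (which converts hours into kilometres) are illustrative assumptions, not the exact formulas used by STRF, Geoi-DBN, or GTWNN. The point is only that the weights depend on distance alone and are not learned from the data.

```python
import math


def fixed_weight_pm25(target, neighbours, lam=10.0, eps=1e-6):
    """Weighted mean of neighbouring PM2.5 with a fixed (non-learned) weight.

    target:     (x_km, y_km, t_hour) of the prediction position
    neighbours: list of (x_km, y_km, t_hour, pm25) station observations
    lam:        hypothetical scale factor turning hours into km
    eps:        small constant that avoids division by zero
    """
    tx, ty, tt = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, t, pm25 in neighbours:
        d_space = math.hypot(x - tx, y - ty)   # spatial distance in km
        d_time = lam * abs(t - tt)             # temporal distance rescaled to km
        # Assumed weight: inverse squared space-time distance.
        weight = 1.0 / (d_space ** 2 + d_time ** 2 + eps)
        weighted_sum += weight * pm25
        weight_total += weight
    if weight_total == 0.0:
        return None  # no neighbours available
    return weighted_sum / weight_total


if __name__ == "__main__":
    stations = [(10.0, 0.0, 12, 100.0), (0.0, 20.0, 12, 50.0)]
    print(fixed_weight_pm25((0.0, 0.0, 12), stations))
```

Because the weights here are a fixed function of distance, a station twice as far away always counts a quarter as much, regardless of terrain, wind, or season. An adaptive weight function would instead let the model learn how much each neighbour should contribute.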

To fill these gaps, we develop a Spatial and Temporal Weighted Continuous Deep Neural Network (STWC-DNN) for better estimating hourly PM_2.5 concentrations across China using Himawari-8 AOD products. For STWC-DNN, specific networks are proposed to automatically establish spatiotemporal weight functions to better consider the discrete meteorological influences on PM_2.5 concentrations. Moreover, in addition to discrete predictor variables, we further consider spatially continuous meteorological influences on PM_2.5 concentrations at the prediction pixel by employing raster data sources. The accuracy of estimated PM_2.5 concentrations is evaluated through sample-based, time-based, and station-based 10-fold cross-validation (CV) (Li et al. 2020b). To further evaluate the performance of STWC-DNN, the three 10-fold CV results of STWC-DNN are compared with those of several recent models, including the Multiple Linear Regression (MLR) model, GTWR model, RF model, STRF model, Geoi-DBN model, GTWNN model, and STLG model. Since previous models rarely considered spatially continuous variable data, the principle of establishing STWC-DNN sheds useful light on the effective inclusion of multiple raster data for better estimating the concentrations of PM_2.5, ground ozone and other pollutants. The code of STWC-DNN is available at https://github.com/wangzh2022/STWC-DNN.
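The three cross-validation schemes differ only in what is held out together. The sketch below shows one common way to assign fold labels, under the assumption that sample-based CV splits individual records at random, time-based CV holds out whole time steps, and station-based CV holds out whole stations. The function name `ten_fold_groups` and the record keys `station` and `time` are hypothetical; the exact procedure used in the paper follows Li et al. (2020b) and may differ in detail.

```python
import random


def ten_fold_groups(samples, mode, k=10, seed=0):
    """Return a fold id in [0, k) for every sample.

    samples: list of dicts with (hypothetical) keys 'station' and 'time'
    mode:    'sample'  -> each record is assigned independently
             'time'    -> all records of one time step share a fold
             'station' -> all records of one station share a fold
    """
    if mode == "sample":
        group_keys = list(range(len(samples)))
    elif mode in ("time", "station"):
        group_keys = [s[mode] for s in samples]
    else:
        raise ValueError("mode must be 'sample', 'time' or 'station'")

    # Shuffle the distinct groups, then deal them round-robin into k folds.
    unique_keys = sorted(set(group_keys))
    random.Random(seed).shuffle(unique_keys)
    fold_of_key = {key: j % k for j, key in enumerate(unique_keys)}
    return [fold_of_key[key] for key in group_keys]


if __name__ == "__main__":
    # Toy data: 20 stations observed at 30 hourly time steps.
    data = [{"station": s, "time": t} for s in range(20) for t in range(30)]
    folds = ten_fold_groups(data, "station")
    held_out = {d["station"] for d, f in zip(data, folds) if f == 0}
    print("stations held out in fold 0:", sorted(held_out))
```

Under this reading, station-based CV tests predictions at locations the model has never seen, which is consistent with it giving the lowest of the three reported R² values (0.79).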

Section snippets

Data sources

The data used in the study mainly included ground-observed PM_2.5 concentrations, satellite-retrieved AOD with a 5 km spatial resolution and a 1 h temporal resolution, meteorological data (Boundary Layer Height (BLH), Relative Humidity (RH), Surface Pressure (SP), etc.), and auxiliary data related to PM_2.5 concentrations, such as the Normalized Difference Vegetation Index (NDVI) and a Digital Elevation Model (DEM). Details of the data sources are introduced as follows.

Methods

Nonlinear PM_2.5-meteorology interactions are highly complicated and present notable spatiotemporal variations (Chen et al., 2020). The uncertainty of the AOD-PM_2.5 relationship is mainly caused by the complicated, underlying influence of predictor variables on PM_2.5 concentrations. In addition to predictor variables at the prediction pixel, the spatiotemporal distribution of PM_2.5 levels and predictor variables can also have a strong influence on PM_2.5 concentrations at the prediction pixel.
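As an illustration of how predictor variables from surrounding pixels might be gathered for a prediction pixel, the sketch below cuts a square window out of a raster stored as a list of rows. The helper name `extract_patch`, the window size, and the choice to mark cells outside the raster as `None` are assumptions made for this example; this snippet of the paper does not state how the surrounding-region input is actually assembled.

```python
def extract_patch(raster, row, col, half_width):
    """Return the (2*half_width + 1)-square window centred on (row, col).

    raster: list of equally long rows (e.g. one meteorological variable)
    Cells that fall outside the raster are returned as None (an assumed
    convention for missing surroundings).
    """
    n_rows = len(raster)
    n_cols = len(raster[0]) if n_rows else 0
    patch = []
    for r in range(row - half_width, row + half_width + 1):
        line = []
        for c in range(col - half_width, col + half_width + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            line.append(raster[r][c] if inside else None)
        patch.append(line)
    return patch


if __name__ == "__main__":
    # Toy 4 x 4 raster whose value encodes its own position (row*10 + col).
    grid = [[r * 10 + c for c in range(4)] for r in range(4)]
    for line in extract_patch(grid, 1, 1, 1):
        print(line)
```

Two pixels with identical values at their centres can still receive different windows, which is the situation the text describes: the same local predictors combined with different surrounding conditions.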

Descriptive statistics

Fig. 4 shows the histograms and descriptive statistics of variables (except for DEM that does not change over time) in the entire model fitting data set. With the removal of missing data, there was a total of 465,362 observed PM_2.5 concentration samples where corresponding gridded variables were available in 2017 over China. In this data set, the annual mean PM_2.5 concentration was 58.79 ± 44.35 μg/m³. For seasons, the highest value was in winter (77.40 μg/m³) and lower values were in autumn

Discussion

Despite its satisfactory accuracy, limitations remained for STWC-DNN. Firstly, since the temporal resolution of the meteorological data was 6 h, which was much coarser than that of the Himawari-8 AOD products, the hourly temporal variations of PM_2.5 have not yet been fully utilized. Some alternative sources such as the forecast products from the Goddard Earth Observing System (GEOS) (https://gmao.gsfc.nasa.gov/GMAO_products/NRT_products.php) or the meteorological data at stations, with a higher temporal resolution

Conclusions

To comprehensively understand the influence of predictor variables on PM_2.5 concentrations, we proposed a Spatial and Temporal Weighted Continuous Deep Neural Network (STWC-DNN), which employed an automatic spatiotemporal weight function and a set of raster predictor variables. STWC-DNN included three sub-networks, Single Pixel Network (SPN), Multiple Station Network (MSN) and Continuous Region Network (CRN), which aimed to consider the influence of predictor variables at the target position,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Grant No. 42171399, Grant No. 41901414) and the Beijing Municipal Natural Science Foundation, China (Grant No. 8202031).

References (57)

K. Bessho et al. An introduction to Himawari-8/9—Japan’s new-generation geostationary meteorological satellites. J. Meteorol. Soc. Jpn (2016)
J. Chen et al. Stacking machine learning model for estimating hourly PM_2.5 in China based on Himawari 8 aerosol optical depth data. Sci. Total Environ. (2019)
Z. Chen et al. Detecting the causality influence of individual meteorological factors on local PM_2.5 concentration in the Jing-Jin-Ji region. Sci. Rep. (2017)
Z. Chen et al. Understanding meteorological influences on PM_2.5 concentrations across China: a temporal and spatial perspective. Atmos. Chem. Phys. (2018)
Z. Chen et al. The control of anthropogenic emissions contributed to 80% of the decrease in PM_2.5 concentrations. Atmos. Chem. Phys. (2019)
Z. Chen et al. Influence of meteorological conditions on PM_2.5 concentrations across China: a review of methodology and mechanism. Environ. Int. (2020)
D.L. Crouse et al. Risk of Nonaccidental and Cardiovascular Mortality in Relation to Long-term Exposure to Low Concentrations of Fine Particulate Matter: A Canadian National-Level Cohort Study. Environ. Health Perspect. (2012)
Devlin, J., Chang, M., Lee, K., et al., 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language...
J. Duchi et al. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. (2011)
J.A. Engel-Cox et al. Qualitative and quantitative evaluation of MODIS satellite sensor data for regional and urban scale air quality. Atmos. Environ. (2004)
X. Fang et al. Satellite-based ground PM_2.5 estimation using timely structure adaptive modeling. Remote Sens. Environ. (2016)
S. Gao et al. Res2Net: A New Multi-scale Backbone Architecture. IEEE Trans. Pattern Anal. Mach. Intell. (2019)
X. Glorot et al. Deep sparse rectifier neural networks. J. Mach. Learn. Res. (2011)
P. Gupta et al. Particulate matter air quality assessment using integrated surface, satellite, and meteorological products: multiple regression approach. J. Geophys. Res.: Atmos. (2009)
G.S.W. Hagler et al. Local and regional anthropogenic influence on PM_2.5 elements in Hong Kong. Atmos. Environ. (2007)
K. He et al. Deep Residual Learning for Image Recognition. IEEE Conf. Comput. Vision Pattern Recogn. (2016)
Q. He et al. Satellite-based mapping of daily high-resolution ground PM_2.5 in China via space-time regression modeling. Remote Sens. Environ. (2018)
X. Hu et al. Estimating PM_2.5 Concentrations in the Conterminous United States Using the Random Forest Approach. Environ. Sci. Technol. (2017)
X. Hu et al. Estimating ground-level PM_2.5 concentrations in the Southeastern United States using MAIAC AOD retrievals and a two-stage model. Remote Sens. Environ. (2014)
B. Huang et al. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int. J. Geogr. Inf. Sci. (2010)
Krizhevsky A., Sutskever I., Hinton G., et al. 2012. ImageNet classification with deep convolutional neural networks....
M. Kikuchi et al. Improved Hourly Estimates of Aerosol Optical Thickness Using Spatiotemporal Variability Derived From Himawari-8 Geostationary Satellite. IEEE Trans. Geosci. Remote Sens. (2018)
I. Kloog et al. Long- and Short-Term Exposure to PM_2.5 and Mortality: Using Novel Exposure Models. Epidemiology (2013)
T. Lin et al. Feature Pyramid Networks for Object Detection. Comput. Vision Pattern Recogn. (2017)
T. Li et al. Point-surface fusion of station measurements and satellite observations for mapping PM_2.5 distribution in China: Methods and assessment. Atmos. Environ. (2017)
T. Li et al. Estimating Ground-Level PM_2.5 by Fusing Satellite and Station Observations: A Geo-Intelligent Deep Learning Approach: Deep Learning for PM_2.5 Estimation. Geophys. Res. Lett. (2017)
R. Li et al. Estimating ground-level PM_2.5 using fine-resolution satellite data in the megacity of Beijing, China. Aerosol Air Qual. Res. (2015)
T. Li et al. Geographically and temporally weighted neural networks for satellite-based mapping of ground-level PM_2.5. ISPRS J. Photogramm. Remote Sens. (2020)


