A machine learning approach to modelling the spatial variations in the daily fine particulate matter (PM2.5) and nitrogen dioxide (NO2) of Shanghai, China



Environment and Planning B: Urban Analytics and City Science (IF 2.6), Pub Date: 2021-01-11, DOI: 10.1177/2399808320975031
Xin-Yi Song ₁, Ya Gao ₁, Yubo Peng ₂, Sen Huang ₃, Chao Liu ₄, Zhong-Ren Peng ₂


It is challenging to forecast high-resolution spatial-temporal patterns of intra-urban air pollution and identify impacting factors at the regional scale. Studies have attempted to capture features of air pollutants such as fine particulate matter (PM₂.₅) and nitrogen dioxide (NO₂) using land use regression models, but this method overlooks multi-collinearity among factors and non-linear correlations between factors and air pollutants, and it performs poorly on daily data. Machine learning, by contrast, is a feasible approach for establishing persuasive models of daily intra-urban air pollution variation. In this article, random forest is used to establish intra-urban PM₂.₅ and NO₂ spatial-temporal variation models and is compared with the traditional land use regression method. Taking the city of Shanghai, China as the case area, 36 station-measured daily records of PM₂.₅ and NO₂ concentrations over two and a half years were collected. More than 80 predictors associated with meteorological and geographical conditions, transportation, community population density, land use and points of interest are used to construct the land use regression and random forest models. Results from the two methods are compared and impacting factors are identified. Explained variance (R²) is used to quantify and compare model performance. The final land use regression model explains 49.3% and 42.2% of the spatial variation in ambient PM₂.₅ and NO₂, respectively, whereas the random forest model explains 78.1% and 60.5% of the variance. Regression mappings for unsampled sites on a 1 km × 1 km grid are also implemented. The random forest model is shown to perform much better than the land use regression model. Overall, the findings suggest that the random forest approach offers a robust improvement in predictive performance over the land use regression model in estimating daily spatial variations in ambient PM₂.₅ and NO₂.
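The abstract compares the two models by explained variance (R²) but does not spell out how it is computed. The following is a minimal sketch, assuming the standard coefficient of determination, R² = 1 − SS_res / SS_tot. The function name `r_squared`, the sample concentrations, and the two sets of predictions are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up daily PM2.5 concentrations (µg/m³), for illustration only.
observed = [35.0, 42.0, 28.0, 55.0, 61.0, 47.0]
model_a = [38.0, 40.0, 35.0, 50.0, 52.0, 45.0]  # hypothetical predictions, model A
model_b = [36.0, 43.0, 29.0, 53.0, 59.0, 46.0]  # hypothetical predictions, model B

print(f"model A R^2 = {r_squared(observed, model_a):.3f}")
print(f"model B R^2 = {r_squared(observed, model_b):.3f}")
```

Under this definition, a model that always predicts the observed mean scores 0 and a perfect model scores 1, so a reported value of 78.1% would correspond to R² = 0.781.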

Updated: 2021-01-11


