An Ensemble Learning Approach for Estimating High Spatiotemporal Resolution of Ground-Level Ozone in the Contiguous United States.
Environmental Science & Technology (IF 11.4), Pub Date: 2020-08-18, DOI: 10.1021/acs.est.0c01791
Weeberb J Requia 1,2, Qian Di 1,3, Rachel Silvern 4, James T Kelly 5, Petros Koutrakis 1, Loretta J Mickley 4, Melissa P Sulprizio 4, Heresh Amini 1,6, Liuhua Shi 1,7, Joel Schwartz 1

In this paper, we integrated multiple types of predictor variables and three types of machine learners (neural network, random forest, and gradient boosting) into a geographically weighted ensemble model to estimate the daily maximum 8-h O₃ concentration at high resolution in both space (1 km × 1 km grid cells covering the contiguous United States) and time (daily estimates between 2000 and 2016). We further quantified monthly model uncertainty for our 1 km × 1 km gridded domain. The results demonstrate high overall model performance, with an average cross-validated R² (coefficient of determination) against observations of 0.90 overall and 0.86 for annual averages. The three machine learning algorithms performed similarly to one another, and the ensemble model outperformed each individual algorithm. The East North Central region of the United States had the highest R² (0.93), and performance was weakest in the western mountainous regions (R² of 0.86) and New England (R² of 0.87). In the cross-validation by season, our model performed best during summer, with an R² of 0.88. This study can help the environmental health community estimate the health impacts of O₃ over space and time more accurately, especially in health studies at an intra-urban scale.
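
As a rough illustration of what a geographically weighted ensemble can look like, the Python sketch below blends the predictions of three base learners using weights fitted locally around each target location, and scores the result with R². It is not the authors' implementation: the Gaussian distance kernel, the 100 km bandwidth, the synthetic monitor data, and the names `gw_ensemble_predict`, `r_squared`, and `learner_a` to `learner_c` are all assumptions made for this example.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination (R^2) of predictions against observations."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def gw_ensemble_predict(train_xy, train_preds, train_obs,
                        target_xy, target_preds, bandwidth_km=100.0):
    """Blend base-learner predictions with locally fitted weights.

    For each target location, a weighted least-squares fit of observations on
    the base-learner predictions is computed, where nearby training monitors
    count more (Gaussian kernel in distance; an assumed choice).
    """
    # Design matrix: intercept plus one column per base learner.
    design = np.column_stack([np.ones(len(train_obs)), train_preds])
    blended = np.empty(target_xy.shape[0])
    for i in range(target_xy.shape[0]):
        dist = np.linalg.norm(train_xy - target_xy[i], axis=1)
        kernel = np.exp(-0.5 * (dist / bandwidth_km) ** 2)
        root_w = np.sqrt(kernel)  # scaling rows by sqrt(w) yields weighted least squares
        coef, *_ = np.linalg.lstsq(design * root_w[:, None],
                                   train_obs * root_w, rcond=None)
        blended[i] = coef[0] + target_preds[i] @ coef[1:]
    return blended


# Synthetic demonstration data (entirely made up, not from the study).
rng = np.random.default_rng(0)
monitors_xy = rng.uniform(0.0, 300.0, size=(80, 2))  # monitor coordinates in km
obs = 40.0 + 0.05 * monitors_xy[:, 0] + rng.normal(0.0, 2.0, 80)  # "observed" O3, ppb
base_preds = np.column_stack(
    [obs + rng.normal(0.0, noise_sd, 80) for noise_sd in (3.0, 4.0, 5.0)]
)

train, test = np.arange(60), np.arange(60, 80)  # simple hold-out split
blended = gw_ensemble_predict(monitors_xy[train], base_preds[train], obs[train],
                              monitors_xy[test], base_preds[test])

print("ensemble R^2:", r_squared(obs[test], blended))
for j, name in enumerate(("learner_a", "learner_b", "learner_c")):
    print(name, "R^2:", r_squared(obs[test], base_preds[test, j]))
```

The local fit lets the relative trust placed in each learner vary across the map, which is the general motivation for weighting an ensemble geographically; a real application would tune the kernel and bandwidth and use a proper cross-validation scheme rather than a single hold-out split.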

Updated: 2020-09-15