The sensitivity of satellite-based PM2.5 estimates to its inputs: Implications to model development in data-poor regions
Environment International ( IF 11.8 ) Pub Date : 2018-10-06 , DOI: 10.1016/j.envint.2018.09.051
Guannan Geng , Nancy L. Murray , Howard H. Chang , Yang Liu

Exposure to fine particulate matter (PM2.5) has been associated with a wide range of negative health outcomes. The overwhelming majority of the epidemiological studies that helped establish these associations were conducted in regions with sufficient ground observations and other supporting data, i.e., data-rich regions. However, research on air pollution health effects in data-poor regions, where pollution levels are often the highest, is still very limited because high-quality exposure estimates are lacking. To improve our understanding of the input datasets needed to apply satellite-based PM2.5 exposure models in data-poor areas, we applied a Bayesian ensemble model in the southeast U.S., which was selected as a representative data-rich region. We designed four groups of sensitivity tests to simulate various data-poor scenarios. The factors considered as influences on model performance were the temporal sampling frequency of the monitors, the number of ground monitors, the accuracy of the chemical transport model simulation of PM2.5 concentrations, and different combinations of the additional predictors. While our full model achieved a 10-fold cross-validated (CV) R2 of 0.82, we found that when the sampling frequency was reduced from the current 1-in-3 day to 1-in-9 day, the CV R2 decreased to 0.58, and the predictions could not capture the daily variations of PM2.5. Half of the current stations (i.e., 30 monitors) could still support a robust model with a CV R2 of 0.79. With 20 monitors, the CV R2 decreased from 0.71 to 0.55 when 100% additional random errors were added to the original CMAQ simulations. However, with a sufficient number of ground monitors (e.g., 30 monitors), our Bayesian ensemble model could tolerate CMAQ errors with only a slight decrease in CV R2 (from 0.79 to 0.75).
With fewer than 15 monitors, our full model collapsed and failed to fit any covariates, while the models with only time-varying variables could still converge even with only five monitors left. A model without the land use parameters lacked fine spatial details in the prediction maps, but could still capture the daily variability of PM2.5 (CV R2 ≥ 0.67) and might support a study of the acute health effects of PM2.5 exposure.
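
The data-degradation steps behind the sensitivity tests can be illustrated with a minimal Python sketch. It is not the authors' code and it does not include the Bayesian ensemble model itself. All function names are hypothetical, and the abstract does not state how the random CMAQ error was generated or how the CV R2 was computed, so the uniform multiplicative noise and the coefficient-of-determination formula below are assumptions made only for illustration.

```python
import random


def thin_sampling(sampling_days, keep_every=3):
    """Keep every `keep_every`-th sampling day (1-in-3 day becomes 1-in-9 day for keep_every=3)."""
    return sampling_days[::keep_every]


def subset_monitors(monitor_ids, n_keep, seed=0):
    """Randomly retain `n_keep` monitors to mimic a sparser ground network."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(monitor_ids), n_keep))


def perturb_cmaq(cmaq_values, error_fraction=1.0, seed=0):
    """Add zero-mean uniform noise of up to `error_fraction` of each value.

    Assumption: the abstract only says "100% additional random errors";
    the exact error distribution is not given there.
    """
    rng = random.Random(seed)
    # Clip at zero because concentrations cannot be negative.
    return [max(0.0, v + v * rng.uniform(-error_fraction, error_fraction))
            for v in cmaq_values]


def assign_folds(n_obs, k=10, seed=0):
    """Assign each observation to one of k cross-validation folds of near-equal size."""
    rng = random.Random(seed)
    folds = [i % k for i in range(n_obs)]
    rng.shuffle(folds)
    return folds


def r_squared(observed, predicted):
    """Coefficient of determination between observations and held-out predictions (assumed metric)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    days_1_in_3 = list(range(0, 90, 3))        # hypothetical 1-in-3 day schedule
    days_1_in_9 = thin_sampling(days_1_in_3)   # thinned to 1-in-9 day
    kept = subset_monitors(range(60), 30)      # half of 60 hypothetical monitor IDs
    noisy = perturb_cmaq([8.0, 12.5, 20.0])    # made-up CMAQ values, for shape only
    print(len(days_1_in_9), len(kept), len(noisy))
```

In an actual experiment, each degraded dataset would be passed to the exposure model, and `r_squared` would be evaluated on predictions for the held-out fold from `assign_folds`.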




Updated: 2018-10-06