Predicting and mapping neighborhood-scale health outcomes: A machine learning approach
Computers, Environment and Urban Systems (IF 7.1), Pub Date: 2021-01-01, DOI: 10.1016/j.compenvurbsys.2020.101562
Chen Feng, Junfeng Jiao

Abstract: Estimating health outcomes at a neighborhood scale is important for promoting urban health, yet it is costly and time-consuming. In this paper, we present a machine-learning-enabled approach to predicting the prevalence of six common non-communicable chronic diseases at the census tract level. We apply our approach to the City of Austin and show that it can yield fairly accurate predictions. In searching for the best predictive models, we experiment with eight different machine learning algorithms and 60 predictor variables that characterize the social environment, the physical environment, and the aspects and degrees of neighborhood disorder. Our analysis suggests that (a) the sociodemographic and socioeconomic variables are the strongest predictors of tract-level health outcomes and (b) the historical records of 311 service requests can be a useful complementary data source, as the information distilled from the 311 data often helps improve the models' performance. The machine learning models produced by this study can help the public and city officials evaluate future scenarios and understand how changes in neighborhood conditions can lead to changes in health outcomes. By analyzing where the largest discrepancies between the predicted and the actual values occur, we can also identify areas of best practice and areas in need of greater investment or policy intervention.
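
The final step of the abstract, locating the tracts where predicted and actual prevalence differ most, can be illustrated with a minimal sketch. This is not the authors' code: the function name `rank_discrepancies`, the tract identifiers, and the prevalence values are all hypothetical and synthetic, and the sign convention (actual minus predicted) is an assumption chosen for illustration.

```python
from typing import Dict, List, Tuple


def rank_discrepancies(
    actual: Dict[str, float], predicted: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (tract_id, residual) pairs, largest absolute residual first.

    residual = actual - predicted (an assumed convention), so a positive
    value means the observed prevalence exceeds the model's prediction.
    """
    # Only compare tracts that appear in both inputs.
    shared = actual.keys() & predicted.keys()
    residuals = [(tract, actual[tract] - predicted[tract]) for tract in shared]
    # Sort by descending absolute residual; break ties by tract id.
    return sorted(residuals, key=lambda pair: (-abs(pair[1]), pair[0]))


# Synthetic, illustrative prevalence values (percent); not data from the paper.
actual = {"tract_A": 12.0, "tract_B": 8.5, "tract_C": 15.0}
predicted = {"tract_A": 10.0, "tract_B": 9.0, "tract_C": 14.0}

ranked = rank_discrepancies(actual, predicted)
for tract, residual in ranked:
    print(f"{tract}: residual {residual:+.1f}")
```

Under this convention, a tract with a large positive residual has higher prevalence than its characteristics would suggest and might be a candidate for further investment, while a large negative residual might point to a possible area of best practice. How such gaps should actually be interpreted is left to the paper itself.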

Updated: 2021-01-01