Machine learning approaches to identify thresholds in a heat-health warning system context, The Journal of the Royal Statistical Society, Series A (Statistics in Society)


The Journal of the Royal Statistical Society, Series A (Statistics in Society) ( IF 1.5 ) Pub Date : 2021-08-23 , DOI: 10.1111/rssa.12745
Pierre Masselot ₁, Fateh Chebana ₂, Céline Campagna ₂,₃, Éric Lavigne ₄,₅, Taha B.M.J. Ouarda ₂, Pierre Gosselin ₂,₃,₆


Over the last two decades, a number of countries and cities have established heat-health warning systems to alert public health authorities when a heat indicator exceeds a predetermined threshold. Different methods have been used around the world to establish these thresholds, each with its own strengths and weaknesses. What they have in common is that they rely on exposure-response function estimates, which can fail in many situations. This paper proposes several data-driven methods to establish thresholds using historical data on health outcomes and environmental indicators. The proposed methods are model-based regression trees (MOB), multivariate adaptive regression splines (MARS), the patient rule-induction method (PRIM) and adaptive index models (AIM). All of these methods look for relevant splits in the association between the indicators and the health outcome, but each does so in a different way. The methods are compared in a simulation study and a real-world case study. The results show that the proposed methods predict adverse days better than current thresholds and benchmark methods. They also suggest that PRIM is overall the most reliable method, with results that vary little across scenarios and cases.
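To make the idea of finding splits between indicators and a health outcome more concrete, here is a minimal sketch of a PRIM-style peeling loop. It is a simplified illustration and not the authors' implementation: it peels only from the lower end of each indicator (so that the result reads as "alert when every indicator is at or above its threshold"), stops as soon as no peel raises the mean outcome, and omits the pasting and cross-validation steps of full PRIM. The function name `prim_peel`, its parameters, and the toy data are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def prim_peel(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    alpha: float = 0.1,
    min_support: float = 0.1,
) -> Tuple[List[float], float]:
    """Simplified PRIM-style peeling restricted to lower bounds (illustrative).

    X holds one row of indicator values per day, y the health outcome per day.
    Returns one lower threshold per indicator and the mean outcome in the box.
    """
    n, p = len(X), len(X[0])
    lower = [float("-inf")] * p
    inside = list(range(n))  # indices of the days currently inside the box

    def mean_y(idx: List[int]) -> float:
        return sum(y[i] for i in idx) / len(idx)

    while True:
        best = None  # (mean outcome, indicator index, days kept)
        for j in range(p):
            values = sorted(X[i][j] for i in inside)
            # Peel roughly the lowest alpha fraction of days on indicator j;
            # days tied with the cut value are peeled together.
            k = max(1, int(alpha * len(values)))
            cut = values[k - 1]
            kept = [i for i in inside if X[i][j] > cut]
            # Reject peels that would leave the box below the minimum support.
            if len(kept) < min_support * n:
                continue
            m = mean_y(kept)
            if best is None or m > best[0]:
                best = (m, j, kept)
        # Stop when no admissible peel increases the mean outcome.
        if best is None or best[0] <= mean_y(inside):
            break
        _, j, inside = best
        lower[j] = min(X[i][j] for i in inside)
    return lower, mean_y(inside)


# Synthetic toy data (not from the paper): two hypothetical indicators on a
# 10 x 10 grid, with an adverse day whenever both exceed a known level.
X_toy = [(a, b) for a in range(10) for b in range(10)]
y_toy = [1 if a >= 6 and b >= 4 else 0 for a, b in X_toy]

thresholds, adverse_rate = prim_peel(X_toy, y_toy)
print(thresholds, adverse_rate)
```

On real data the outcome would typically be a daily count or excess of health events, and the peeling fraction `alpha` and minimum support would need tuning, for instance by cross-validation.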


Updated: 2021-10-31
