Ordinal Decision-Tree-Based Ensemble Approaches: The Case of Controlling the Daily Local Growth Rate of the COVID-19 Epidemic
Entropy (IF 2.1), Pub Date: 2020-08-07, DOI: 10.3390/e22080871
Gonen Singer, Matan Marudi

In this research, we develop ordinal decision-tree-based ensemble approaches in which an objective-based information gain measure is used to select the classifying attributes. We demonstrate the applicability of the approaches using AdaBoost and random forest algorithms for the task of classifying the regional daily growth factor of the spread of an epidemic based on a variety of explanatory factors. In such an application, some of the potential classification errors could have critical consequences. The classification tool will enable the spread of the epidemic to be tracked and controlled by yielding insights regarding the relationship between local containment measures and the daily growth factor. In order to benefit maximally from a variety of ordinal and non-ordinal algorithms, we also propose an ensemble majority voting approach to combine different algorithms into one model, thereby leveraging the strengths of each algorithm. We perform experiments in which the task is to classify the daily COVID-19 growth rate factor based on environmental factors and containment measures for 19 regions of Italy. We demonstrate that the ordinal algorithms outperform their non-ordinal counterparts with improvements in the range of 6–25% for a variety of common performance indices. The majority voting approach that combines ordinal and non-ordinal models yields a further improvement of between 3% and 10%.
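The majority voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote`, the integer encoding of the ordinal growth-factor classes, and the tie-breaking rule (prefer the higher class) are all assumptions made for illustration only.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model class predictions by simple majority vote.

    predictions[m][i] is the class that (hypothetical) model m assigns to
    sample i, encoded as an integer on an ordinal scale, e.g.
    0 = low, 1 = medium, 2 = high daily growth factor (assumed encoding).
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):  # votes of all models for one sample
        counts = Counter(votes)
        top = max(counts.values())
        # Assumption of this sketch: ties go to the higher ordinal class,
        # i.e. the more cautious call. The abstract does not state a rule.
        combined.append(max(c for c, n in counts.items() if n == top))
    return combined


if __name__ == "__main__":
    # Three hypothetical models (e.g. ordinal and non-ordinal variants)
    # voting on three region-days.
    model_outputs = [
        [0, 1, 2],
        [0, 2, 2],
        [1, 1, 0],
    ]
    print(majority_vote(model_outputs))
```

In practice the per-model predictions would come from trained classifiers such as the ordinal and non-ordinal AdaBoost and random forest models mentioned above; here they are hard-coded placeholders.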

Updated: 2020-08-07