Use of Stratified Cascade Learning to predict hospitalization risk with only socioeconomic factors


Journal of Biomedical Informatics (IF 4.0). Pub Date: 2020-02-20. DOI: 10.1016/j.jbi.2020.103393
Anton Filikov₁, Sayali Pethe₁, Robert Kelley₂, Anne Fischer₂, Ron Ozminkowski₂


BACKGROUND AND OBJECTIVE: Published models predicting health-related outcomes rely on clinical, claims, and social determinants of health (SDH) data. To address the challenge of predicting with only SDH data, we developed a novel framework termed Stratified Cascade Learning (SCL) and used it to predict the risk of hospitalization (ROH).

MATERIALS AND METHODS: The variable set includes 27 SDH variables plus "age" and "sex" for a cohort of diabetic patients. The SCL model uses three sub-models: SM1 (built on the whole training set) stratifies the training set into "predictable" and "unpredictable" subsets; SM2 (built on the whole training set) classifies test-set patients as "predictable" or "unpredictable"; and SM3 (built on only the "predictable" subset) predicts the ROH for the patients classified as "predictable" by SM2.

RESULTS: The SCL model does not improve either the AUC or the NPV of the basic classifier, but it materially improves accuracy and specificity at the expense of lower sensitivity for the "predictable" subset. Optimizing the risk thresholds of the sub-models does not noticeably change the AUC or NPV, but it further improves accuracy and specificity at the expense of a further drop in sensitivity.

CONCLUSION: Because the SCL model yields low sensitivity, it fails to predict high-risk patients. However, it yields high specificity, which can be useful when the objective is to eliminate low-risk patients as candidates for further testing or treatment. The use of SCL is not limited to healthcare; it can be applied to any predictive modeling problem in which reliable predictions can be made for only a fraction of the incoming data.
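The three-stage cascade described under MATERIALS AND METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how SM1 decides which training patients are "predictable", so the sketch assumes that a patient is "predictable" when SM1 reproduces the true label. The `NearestCentroid` class is a toy stand-in for the unspecified base classifier, and the names `fit_cascade` and `predict_cascade`, the two features, and all data values are hypothetical.

```python
from math import dist


class NearestCentroid:
    """Toy stand-in for the base classifier (not the model used in the paper)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all rows carrying this label.
            self.centroids[label] = [sum(col) / len(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: dist(x, self.centroids[c]))


def fit_cascade(X, y):
    # SM1: built on the whole training set, used only to stratify it.
    sm1 = NearestCentroid().fit(X, y)
    # ASSUMPTION: "predictable" means SM1 reproduces the true label.
    predictable = [int(sm1.predict(x) == t) for x, t in zip(X, y)]
    # SM2: built on the whole training set, learns predictable vs. unpredictable.
    sm2 = NearestCentroid().fit(X, predictable)
    # SM3: built on the "predictable" subset only, predicts the outcome.
    # (This toy version assumes the subset is non-empty.)
    X_p = [x for x, p in zip(X, predictable) if p]
    y_p = [t for t, p in zip(y, predictable) if p]
    sm3 = NearestCentroid().fit(X_p, y_p)
    return sm2, sm3


def predict_cascade(sm2, sm3, x):
    """Return SM3's prediction, or None when SM2 flags x as unpredictable."""
    if sm2.predict(x) == 0:
        return None
    return sm3.predict(x)


# Hypothetical data: two features, label 1 = hospitalized.
X_train = [[0, 0], [1, 0], [2, 0], [8, 0], [9, 0], [10, 0], [4, 1], [6, 1]]
y_train = [0, 0, 0, 1, 1, 1, 1, 0]
X_test = [[1.5, 0.0], [9.5, 0.1], [5.0, 0.9]]

sm2, sm3 = fit_cascade(X_train, y_train)
print([predict_cascade(sm2, sm3, x) for x in X_test])  # [0, 1, None]
```

In this toy data the last two training rows are the ones SM1 misclassifies, so SM2 learns to route similar test rows away from SM3. Returning `None` for those rows mirrors the abstract's point that the cascade only issues predictions for the fraction of incoming data it considers reliable.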


Updated: 2020-02-20
