Risk Prediction of Renal Failure for Chronic Disease Population Based on Electronic Health Record Big Data
Big Data Research (IF 3.5), Pub Date: 2021-04-24, DOI: 10.1016/j.bdr.2021.100234
Yujie Yang, Ye Li, Runge Chen, Jing Zheng, Yunpeng Cai, Giancarlo Fortino

Renal failure is a fatal disease of global concern. Previous risk models for renal failure mostly rely on a diagnosis of chronic kidney disease, which lacks obvious clinical symptoms and therefore often goes undiagnosed, so many high-risk patients are missed. In this paper, we propose a framework that predicts the risk of renal failure directly from a big data repository of a chronic disease population, without requiring a prior diagnosis of chronic kidney disease. The electronic health records of 42,256 patients with hypertension or diabetes were collected from the Shenzhen Health Information Big Data Platform; 398 of these patients developed renal failure during a 3-year follow-up. Five state-of-the-art machine learning methods were used to build risk prediction models of renal failure for the chronic disease population. Extensive experimental results show that the proposed framework performs well. In particular, XGBoost achieves the best performance, with an area under the receiver operating characteristic curve (AUC) of 0.9139. By analyzing the effect of risk factors, we identified serum creatine, age, uric acid, systolic blood pressure, and blood urea nitrogen as the top five factors associated with renal failure risk. Compared with existing models, our model can be deployed in routine chronic disease management procedures and enables more preemptive, wider-coverage screening of renal risk, which would in turn reduce the damage caused by the disease through timely intervention.
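
To make the reported metric concrete, the following is a minimal sketch of how an AUC can be computed from predicted risk scores, together with the outcome rate implied by the cohort figures in the abstract. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are assumptions chosen for illustration, and only the counts 42,256 and 398 come from the abstract.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    Equivalent to the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case, with
    ties counted as one half.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Outcome rate implied by the abstract: 398 events among 42,256 patients.
event_rate_percent = 398 / 42256 * 100

# Toy example (made-up labels and scores, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)

if __name__ == "__main__":
    print(f"event rate: {event_rate_percent:.2f}%")
    print(f"toy AUC: {toy_auc:.2f}")
```

With fewer than 1% of patients developing renal failure, plain accuracy would be uninformative (a model that predicts "no event" for everyone would be about 99% accurate), which is one reason a ranking metric such as AUC is a natural choice for a cohort like this.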




Updated: 2021-04-29