Predictive models for diabetes mellitus using machine learning techniques
BMC Endocrine Disorders (IF 2.8), Pub Date: 2019-10-15, DOI: 10.1186/s12902-019-0436-6
Hang Lai, Huaxiong Huang, Karim Keshavjee, Aziz Guergachi, Xin Gao

Diabetes mellitus is an increasingly prevalent chronic disease characterized by the body's inability to metabolize glucose. The objective of this study was to build an effective predictive model with high sensitivity and selectivity to better identify Canadian patients at risk of diabetes mellitus, based on patient demographic data and laboratory results recorded during visits to medical facilities.

Using the most recent records of 13,309 Canadian patients aged 18 to 90 years, including their demographic and laboratory information (age, sex, fasting blood glucose, body mass index, high-density lipoprotein, triglycerides, blood pressure, and low-density lipoprotein), we built predictive models using Logistic Regression and Gradient Boosting Machine (GBM) techniques. The area under the receiver operating characteristic curve (AROC) was used to evaluate the discriminatory capability of these models. We used the adjusted threshold method and the class weight method to improve sensitivity, that is, the proportion of diabetes mellitus patients correctly predicted by the model. We also compared these models with other machine learning techniques such as Decision Tree and Random Forest.

The AROC for the proposed GBM model is 84.7% with a sensitivity of 71.6%, and the AROC for the proposed Logistic Regression model is 84.0% with a sensitivity of 73.4%. The GBM and Logistic Regression models perform better than the Random Forest and Decision Tree models.

Our models predict diabetes well from a small set of commonly used lab results, with satisfactory sensitivity. These models can be built into an online computer program to help physicians identify patients likely to develop diabetes and provide the necessary preventive interventions. The models were developed and validated on a Canadian population, which may make them more specific and better suited to Canadian patients than existing models developed from US or other populations. Fasting blood glucose, body mass index, high-density lipoprotein, and triglycerides were the most important predictors in these models.
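
To make the evaluation terms in the abstract concrete, here is a minimal Python sketch of how AROC and sensitivity can be computed, and how lowering the decision threshold (one way to read the "adjusted threshold method") trades specificity for sensitivity. This is not the authors' code. The function names `auroc` and `sensitivity_specificity` are hypothetical, and the labels and scores are invented toy values, not study data.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Classify as positive when score >= threshold, then return
    (sensitivity, specificity) = (TP / (TP + FN), TN / (TN + FP))."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy, imbalanced example (assumption: 1 = diabetes, 0 = no diabetes).
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.80, 0.45, 0.30, 0.60, 0.35, 0.25, 0.20, 0.15, 0.10, 0.05]

print("AROC:", auroc(labels, scores))
for threshold in (0.5, 0.3):
    sens, spec = sensitivity_specificity(labels, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises sensitivity from 1/3 to 1 while specificity falls from 6/7 to 5/7. The AROC does not change, because it depends only on how the scores rank the cases. The class weight method mentioned in the abstract works differently: it changes how the model is fitted (errors on the minority class are penalized more heavily) rather than how its scores are thresholded, and it is not shown here.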

Updated: 2019-10-15