Prediction of early childhood obesity with machine learning and electronic health record data
International Journal of Medical Informatics (IF 4.9), Pub Date: 2021-04-09, DOI: 10.1016/j.ijmedinf.2021.104454
Xueqin Pang, Christopher B Forrest, Félice Lê-Scherban, Aaron J Masino

Objective

This study compares seven machine learning models developed to predict childhood obesity from age > 2 to ≤ 7 years using Electronic Healthcare Record (EHR) data up to age 2 years.

Materials and methods

EHR data from 860,510 patients with 11,194,579 healthcare encounters were obtained from the Children’s Hospital of Philadelphia. After applying stringent quality control to remove implausible growth values and including only individuals with all recommended wellness visits by age 7 years, 27,203 patients (50.78 % male) remained for model development. Seven machine learning models were developed to predict obesity incidence as defined by the Centers for Disease Control and Prevention (age- and sex-adjusted BMI > 95th percentile). Model performance was evaluated with multiple standard classifier metrics, and differences among the seven models were compared using Cochran's Q test and post-hoc pairwise testing.
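The omnibus comparison mentioned above can be illustrated with a minimal sketch of the Cochran's Q statistic. This is not the authors' code: the function name `cochrans_q` and the toy input are hypothetical, and the input is assumed to be a patient-by-model table of 0/1 flags marking whether each model classified each patient correctly. Under the null hypothesis that all models have the same proportion of correct predictions, Q is approximately chi-square distributed with k - 1 degrees of freedom.

```python
from typing import Sequence, Tuple


def cochrans_q(correct: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Return (Q statistic, degrees of freedom) for Cochran's Q test.

    correct[i][j] is 1 if model j classified patient i correctly, else 0.
    (Hypothetical input layout chosen for this sketch.)
    """
    if not correct or len(correct[0]) < 2:
        raise ValueError("need at least one patient and two models")
    k = len(correct[0])
    if any(len(row) != k for row in correct):
        raise ValueError("every row must have one entry per model")

    # Column totals: number of patients each model got right.
    col_totals = [sum(row[j] for row in correct) for j in range(k)]
    # Row totals: number of models that got each patient right.
    row_totals = [sum(row) for row in correct]
    grand_total = sum(row_totals)

    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens when every patient is classified identically by all models.
        raise ValueError("Q is undefined: no patient has mixed outcomes")
    return numerator / denominator, k - 1


if __name__ == "__main__":
    # Toy data: 6 patients, 3 models (illustrative only, not study data).
    toy = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
        [1, 0, 0],
    ]
    q, df = cochrans_q(toy)
    print(f"Q = {q:.3f}, df = {df}")
```

A significant Q only indicates that at least one model differs; the post-hoc pairwise tests mentioned in the abstract are what identify which pairs differ.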

Results

XGBoost yielded an AUC of 0.81 (0.001), outperforming all other models. It also performed statistically significantly better than all other models on standard classifier metrics (sensitivity fixed at 80 %): precision 30.90 % (0.22 %), F1-score 44.60 % (0.26 %), accuracy 66.14 % (0.41 %), and specificity 63.27 % (0.41 %).
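To make "sensitivity fixed at 80 %" concrete, the sketch below picks the highest score threshold whose sensitivity reaches a target and then reports the remaining metrics at that threshold. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `metrics_at_sensitivity` and `f1_from` are hypothetical, labels are assumed to be 0/1, and a patient is predicted positive when the score is at or above the threshold.

```python
from typing import Dict, Sequence


def f1_from(precision: float, sensitivity: float) -> float:
    """F1 is the harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def metrics_at_sensitivity(
    y_true: Sequence[int],
    scores: Sequence[float],
    target_sensitivity: float = 0.80,
) -> Dict[str, float]:
    """Metrics at the highest threshold whose sensitivity meets the target."""
    if not 0 < target_sensitivity <= 1:
        raise ValueError("target_sensitivity must be in (0, 1]")
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # Walk thresholds from strictest to loosest; the loosest one labels
    # everyone positive, so the target is always reached eventually.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        if tp / positives >= target_sensitivity:
            break

    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = negatives - fp
    sensitivity = tp / positives
    precision = tp / (tp + fp)
    return {
        "threshold": threshold,
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": tn / negatives,
        "accuracy": (tp + tn) / len(y_true),
        "f1": f1_from(precision, sensitivity),
    }


if __name__ == "__main__":
    # Toy labels and scores (illustrative only, not study data).
    labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.6, 0.5, 0.3, 0.1, 0.05]
    for name, value in metrics_at_sensitivity(labels, scores).items():
        print(f"{name}: {value:.4f}")
```

As a consistency check on the reported figures, the harmonic mean of 30.90 % precision and 80 % sensitivity is about 44.6 %, which agrees with the stated F1-score.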

Discussion and conclusion

Early childhood obesity prediction models were developed from the largest cohort reported to date. Relative to prior research, our models generalize to include males and females in a single model and extend the time frame for obesity incidence prediction to 7 years of age. The presented machine learning model development workflow can be adapted to various EHR-based studies and may be valuable for developing other clinical prediction models.



Updated: 2021-04-16