Development of a machine learning-based risk prediction model for cerebral infarction and comparison with nomogram model
Journal of Affective Disorders ( IF 4.9 ) Pub Date : 2022-07-23 , DOI: 10.1016/j.jad.2022.07.045
Xuewen Li 1 , Yiting Wang 1 , Jiancheng Xu 1

Background

This study aimed to develop a cerebral infarction (CI) risk prediction model by applying machine learning algorithms to large-scale routine test data.

Methods

Cohort 1 included CI patients and healthy check-up subjects from 2017. The best-performing algorithm among Extreme Gradient Boosting (XgBoost), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF) was selected to mine all routine test data of the enrolled subjects and screen features for the CI model. Cohort 2 included CI and non-CI patients from 2018 to 2020; it was used to develop an early warning model for CI and was analyzed in subgroups with an age cutoff of 50 years. Cohort 3 included CI and non-CI patients from 2021, and a nomogram model was developed for comparison with the machine learning model.
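The subgroup analysis at the 50-year cutoff can be illustrated with a minimal sketch. It assumes each subject is a tuple of age, binary CI label, and model risk score; the helpers `auc` and `auc_by_age` and the toy records are hypothetical and are not taken from the study. AUC is computed here as the probability that a randomly chosen CI case scores higher than a randomly chosen non-CI subject, with ties counted as one half.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC by pairwise comparison: P(pos > neg) + 0.5 * P(pos == neg)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_age(
    records: Iterable[Tuple[float, int, float]], cutoff: float = 50
) -> Dict[str, float]:
    """Split (age, label, score) records at `cutoff` and compute AUC per group."""
    below: Tuple[List[int], List[float]] = ([], [])
    at_or_above: Tuple[List[int], List[float]] = ([], [])
    for age, label, score in records:
        labels, scores = below if age < cutoff else at_or_above
        labels.append(label)
        scores.append(score)
    return {
        "below_cutoff": auc(*below),
        "at_or_above_cutoff": auc(*at_or_above),
    }


# Toy records (age, CI label, risk score); illustrative only, not study data.
toy_records = [
    (42, 1, 0.80), (38, 0, 0.20), (45, 0, 0.85), (47, 1, 0.60),
    (63, 1, 0.90), (71, 0, 0.30), (58, 1, 0.70), (66, 0, 0.40),
]
subgroup_auc = auc_by_age(toy_records, cutoff=50)
```

The pairwise loop is quadratic in the number of subjects, which is fine for a small illustration; a rank-based computation would be the usual choice for large cohorts.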

Results

XgBoost, the best-performing algorithm, was used to develop CI-Lab8, a CI risk prediction model with eight features: fibrinogen (FIB), age, glucose, mean corpuscular hemoglobin concentration, albumin, absolute neutrophil count, activated partial thromboplastin time, and triglycerides. In cohort 2 the model had an AUC of 0.823, significantly higher than that of FIB alone (AUC = 0.737), the top-ranked feature by importance. CI-Lab8 also showed high diagnostic accuracy in CI patients <50 years of age (AUC = 0.800), slightly lower than in CI patients ≥50 years of age (AUC = 0.856). In cohort 3, receiver operating characteristic curve, calibration curve, and decision curve analyses showed CI-Lab8 to be superior to the nomogram.
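Decision curve analysis compares models by their net benefit across threshold probabilities. The abstract does not describe how the authors computed it, so the following is only a sketch of the standard net-benefit formula, TP/n − (FP/n) · pt/(1 − pt), with hypothetical function names and toy data that are not from the study. A subject is treated as predicted positive when the risk score is at or above the threshold (an assumption of this sketch).

```python
from typing import Sequence


def net_benefit(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on subjects whose score is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every subject as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy labels and risk scores; illustrative only, not study data.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1]

nb_model = net_benefit(toy_labels, toy_scores, threshold=0.5)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.5)
```

In a decision curve, these values would be plotted over a range of thresholds for each model alongside the treat-all and treat-none (net benefit 0) reference lines; a model is preferred where its curve lies highest.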

Conclusion

In this study, the CI risk prediction model developed with the XgBoost algorithm outperformed the nomogram model and showed high diagnostic accuracy for CI in patients both <50 and ≥50 years old. It may assist the clinical assessment of CI.




Updated: 2022-07-25