Machine Learning Prediction of Kidney Stone Composition Using Electronic Health Record-Derived Features
Journal of Endourology (IF 2.9), Pub Date: 2021-07-27, DOI: 10.1089/end.2021.0211
Abin Abraham 1 , Nicholas L Kavoussi 2 , Wilson Sui 2 , Cosmin Bejan 3 , John A Capra 1, 4, 5 , Ryan Hsi 2

Aims: Noninvasive prediction of kidney stone composition could direct dietary and pharmacologic preventative treatment without stone analysis. We aimed to assess the accuracy of machine learning models in predicting kidney stone composition using variables extracted from the electronic health record (EHR).

Materials and Methods: We identified kidney stone patients (n=1,296) with both stone composition and 24-hour (24H) urine testing. We trained machine learning models (XGBoost [XG] and logistic regression [LR]) to predict stone composition using 24H urine data and EHR-derived demographic and comorbidity data. Models predicted either binary (calcium vs. non-calcium stone) or multiclass (calcium oxalate, uric acid, hydroxyapatite, or other) stone types. We evaluated performance using area under the receiver operating characteristic curve (ROC-AUC) and accuracy and identified predictors for each task.

Results: For discriminating binary stone composition, XG outperformed LR with higher accuracy (91% vs. 71%) with ROC-AUC of 0.80 for both models. Top predictors used by these models were supersaturations of uric acid and calcium phosphate, and urinary ammonium. For multiclass classification, LR outperformed XG with higher accuracy (0.64 vs. 0.56) and ROC-AUC (0.79 vs. 0.59), and urine pH had the highest predictive utility. Overall, 24H urine analyte data contributed more to the models’ predictions of stone composition than EHR-derived variables.

Conclusion: Machine learning models can predict calcium stone composition. LR outperforms XG in multiclass stone classification. Demographic and comorbidity data are predictive of stone composition; however, including 24H urine data improves performance. Further optimization of performance could lead to earlier, directed medical therapy for kidney stone patients.
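
The binary results pair different accuracies with the same ROC-AUC, which can look surprising. The sketch below is only an illustration of how the two metrics differ in general: ROC-AUC depends on how cases are ranked, while accuracy also depends on where the decision threshold falls. The labels and scores are invented toy values, the 0.5 threshold is an assumption, and nothing here reproduces the study's data or explains why its models behaved as they did.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly when scores >= threshold are called positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical): 1 = calcium stone, 0 = non-calcium stone.
labels = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

# Two hypothetical models that rank the ten cases identically;
# model B's scores are simply lower across the board.
scores_a = [0.9, 0.8, 0.8, 0.7, 0.7, 0.6, 0.6, 0.55, 0.65, 0.3]
scores_b = [0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.35, 0.45, 0.1]

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, "accuracy:", accuracy(labels, scores), "ROC-AUC:", roc_auc(labels, scores))
```

Both toy models share one ROC-AUC because their rankings match, yet their accuracies differ at the fixed threshold. With imbalanced classes, as in this toy set, accuracy is also sensitive to how often the majority class is predicted.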

Updated: 2021-07-28