Machine learning-based marker for coronary artery disease: derivation and validation in two longitudinal cohorts
The Lancet (IF 98.4), Pub Date: 2022-12-20, DOI: 10.1016/s0140-6736(22)02079-7
Iain S Forrest, Ben O Petrazzini, Áine Duffy, Joshua K Park, Carla Marquez-Luna, Daniel M Jordan, Ghislain Rocheleau, Judy H Cho, Robert S Rosenson, Jagat Narula, Girish N Nadkarni, Ron Do
Binary diagnosis of coronary artery disease does not preserve the complexity of the disease or quantify its severity or its associated risk of death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from the probabilities of a machine learning model.

In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts. We measured the association of ISCAD with clinical outcomes, namely coronary artery stenosis, obstructive coronary artery disease, multivessel coronary artery disease, all-cause death, and coronary artery disease sequelae.

Among 95 935 participants, 35 749 were from the BioMe Biobank (median age 61 years [IQR 18]; 14 599 [41%] were male and 21 150 [59%] were female; 5130 [14%] had diagnosed coronary artery disease) and 60 186 were from the UK Biobank (median age 62 years [IQR 15]; 25 031 [42%] were male and 35 155 [58%] were female; 8128 [14%] had diagnosed coronary artery disease). The model predicted coronary artery disease with an area under the receiver operating characteristic curve of 0·95 (95% CI 0·94–0·95; sensitivity 0·94 [0·94–0·95] and specificity 0·82 [0·81–0·83]) in the BioMe validation set, 0·93 (0·92–0·93; sensitivity 0·90 [0·89–0·90] and specificity 0·88 [0·87–0·88]) in the BioMe holdout set, and 0·91 (0·91–0·91; sensitivity 0·84 [0·83–0·84] and specificity 0·83 [0·82–0·83]) in the UK Biobank external test set. ISCAD captured coronary artery disease risk from known risk factors, pooled cohort equations, and polygenic risk scores. Coronary artery stenosis increased quantitatively with ascending ISCAD quartiles (increase per quartile of 12 percentage points), as did the risk of obstructive coronary artery disease, multivessel coronary artery disease, and stenosis of major coronary arteries. Hazard ratios (HRs) and prevalence of all-cause death increased stepwise over ISCAD deciles (decile 1: HR 1·0 [95% CI 1·0–1·0], 0·2% prevalence; decile 6: HR 11 [3·9–31], 3·1% prevalence; and decile 10: HR 56 [20–158], 11% prevalence). A similar trend was observed for recurrent myocardial infarction. Of the undiagnosed individuals with high ISCAD (≥0·9), 12 (46%) had clinical evidence of coronary artery disease according to the 2014 American College of Cardiology/American Heart Association Task Force guidelines.

Electronic health record-based machine learning was used to generate an in-silico marker for coronary artery disease that can non-invasively quantify atherosclerosis and risk of death on a continuous spectrum, and identify underdiagnosed individuals.

This study was funded by the National Institutes of Health.
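The abstract reports three kinds of quantities for a probability score in the range 0 to 1: sensitivity and specificity at a cut-off, area under the receiver operating characteristic curve, and outcome prevalence across score quantiles. The Python sketch below illustrates how such quantities are commonly computed. It is not the authors' pipeline: the function names, the 0.5 threshold, the rank-based binning, and the ten-person toy dataset are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the chance that a random case outscores a random non-case.

    Ties count as half a win (the Mann-Whitney formulation). This is O(n^2)
    and meant for small illustrative inputs only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def quantile_bins(scores: Sequence[float], n_bins: int) -> List[int]:
    """Assign each score to a rank-based bin numbered 1..n_bins (near-equal sizes)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = [0] * len(scores)
    for rank, index in enumerate(order):
        bins[index] = rank * n_bins // len(scores) + 1
    return bins


def event_rate_by_bin(
    bins: Sequence[int], events: Sequence[int], n_bins: int
) -> List[float]:
    """Fraction of individuals with the event in each bin (a prevalence per bin)."""
    totals = [0] * n_bins
    hits = [0] * n_bins
    for b, e in zip(bins, events):
        totals[b - 1] += 1
        hits[b - 1] += e
    return [h / t if t else float("nan") for h, t in zip(hits, totals)]


# Hypothetical toy data, NOT from the study: ten scores and disease labels.
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

toy_bins = quantile_bins(toy_scores, n_bins=5)
print(sensitivity_specificity(toy_scores, toy_labels))
print(auroc(toy_scores, toy_labels))
print(event_rate_by_bin(toy_bins, toy_labels, n_bins=5))
```

On this toy data the sensitivity and specificity are both 0.8, the AUROC is 0.96, and the per-bin event rate rises from 0 in the lowest bin to 1 in the highest, which is the kind of stepwise pattern the abstract describes across ISCAD quartiles and deciles. The hazard ratios in the abstract come from survival analysis, which this sketch does not attempt.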

Updated: 2022-12-20