Natural Language Processing for Adjudication of Heart Failure in a Multicenter Clinical Trial: A Secondary Analysis of a Randomized Clinical Trial.
JAMA Cardiology (IF 24.0), Pub Date: 2023-11-11, DOI: 10.1001/jamacardio.2023.4859
Jonathan W Cunningham, Pulkit Singh, Christopher Reeder, Brian Claggett, Pablo M Marti-Castellote, Emily S Lau, Shaan Khurshid, Puneet Batra, Steven A Lubitz, Mahnaz Maddah, Anthony Philippakis, Akshay S Desai, Patrick T Ellinor, Orly Vardeny, Scott D Solomon, Jennifer E Ho

Importance: The gold standard for outcome adjudication in clinical trials is medical record review by a physician clinical events committee (CEC), which requires substantial time and expertise. Automated adjudication of medical records by natural language processing (NLP) may offer a more resource-efficient alternative, but this approach has not been validated in a multicenter setting.

Objective: To externally validate the Community Care Cohort Project (C3PO) NLP model for heart failure (HF) hospitalization adjudication, which was previously developed and tested within one health care system, against gold-standard CEC adjudication in a multicenter clinical trial.

Design, Setting, and Participants: This was a retrospective analysis of the Influenza Vaccine to Effectively Stop Cardio Thoracic Events and Decompensated Heart Failure (INVESTED) trial, which compared 2 influenza vaccines in 5260 participants with cardiovascular disease at 157 sites in the US and Canada between September 2016 and January 2019. Analysis was performed from November 2022 to October 2023.

Exposures: Individual sites submitted medical records for each hospitalization. The central INVESTED CEC and the C3PO NLP model independently adjudicated whether the cause of hospitalization was HF using the prepared hospitalization dossier. The C3PO NLP model was fine-tuned (C3PO + INVESTED) and a de novo NLP model was trained using half the INVESTED hospitalizations.

Main Outcomes and Measures: Concordance between the C3PO NLP model HF adjudication and the gold-standard INVESTED CEC adjudication was measured by raw agreement, κ, sensitivity, and specificity. The fine-tuned and de novo INVESTED NLP models were evaluated in an internal validation cohort not used for training.

Results: Among 4060 hospitalizations in 1973 patients (mean [SD] age, 66.4 [13.2] years; 541 [27.4%] female and 1432 [72.6%] male), 1074 hospitalizations (26%) were adjudicated as HF by the CEC. There was good agreement between the C3PO NLP and CEC HF adjudications (raw agreement, 87% [95% CI, 86-88]; κ, 0.69 [95% CI, 0.66-0.72]). C3PO NLP model sensitivity was 94% (95% CI, 92-95) and specificity was 84% (95% CI, 83-85). The fine-tuned C3PO and de novo NLP models demonstrated agreement of 93% (95% CI, 92-94) and κ of 0.82 (95% CI, 0.77-0.86) and 0.83 (95% CI, 0.79-0.87), respectively, vs the CEC. CEC reviewer interrater reproducibility was 94% (95% CI, 93-95; κ, 0.85 [95% CI, 0.80-0.89]).

Conclusions and Relevance: The C3PO NLP model developed within 1 health care system identified HF events with good agreement relative to the gold-standard CEC in an external multicenter clinical trial. Fine-tuning the model improved agreement and approximated human reproducibility. Further study is needed to determine whether NLP will improve the efficiency of future multicenter clinical trials by identifying clinical events at scale.
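
For readers less familiar with the concordance measures named in the abstract, the following is a minimal sketch of how raw agreement, Cohen's κ, sensitivity, and specificity can be computed from a 2x2 table of NLP vs CEC adjudications. The function and class names are hypothetical, and the cell counts are an assumption: they are back-calculated approximately from the rounded percentages reported above (4060 hospitalizations, 1074 CEC-adjudicated HF events, 94% sensitivity, 84% specificity) and are not the study's actual table. The abstract also does not state which software the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concordance:
    agreement: float
    kappa: float
    sensitivity: float
    specificity: float


def concordance(tp: int, fp: int, fn: int, tn: int) -> Concordance:
    """Agreement of a binary test rater (NLP) with a reference rater (CEC).

    tp/fn: reference-positive events the test called positive/negative.
    fp/tn: reference-negative events the test called positive/negative.
    """
    n = tp + fp + fn + tn
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("reference must contain both positive and negative events")

    observed = (tp + tn) / n  # raw agreement

    # Agreement expected by chance, from each rater's marginal positive rate.
    ref_pos = (tp + fn) / n
    test_pos = (tp + fp) / n
    expected = ref_pos * test_pos + (1 - ref_pos) * (1 - test_pos)

    kappa = (observed - expected) / (1 - expected)
    return Concordance(
        agreement=observed,
        kappa=kappa,
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
    )


if __name__ == "__main__":
    # ASSUMPTION: approximate counts reconstructed from rounded percentages,
    # for illustration only; not the published 2x2 table.
    m = concordance(tp=1010, fp=478, fn=64, tn=2508)
    print(f"agreement={m.agreement:.2f} kappa={m.kappa:.2f} "
          f"sensitivity={m.sensitivity:.2f} specificity={m.specificity:.2f}")
    # prints: agreement=0.87 kappa=0.69 sensitivity=0.94 specificity=0.84
```

With these approximate counts the sketch reproduces the reported point estimates to two decimals, which illustrates why κ (0.69) sits well below raw agreement (87%): about 56% agreement would be expected by chance alone given the two raters' marginal HF rates.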

Updated: 2023-11-11