Automatic trial eligibility surveillance based on unstructured clinical data.
International Journal of Medical Informatics. Pub Date: 2019-05-23, DOI: 10.1016/j.ijmedinf.2019.05.018
Stéphane M Meystre 1 , Paul M Heider 2 , Youngjun Kim 2 , Daniel B Aruch 3 , Carolyn D Britten 3

INTRODUCTION Insufficient patient enrollment in clinical trials remains a serious and costly problem and is often considered the most critical issue to solve for the clinical trials community. In this project, we assessed the feasibility of automatically detecting a patient's eligibility for a sample of breast cancer clinical trials by mapping coded clinical trial eligibility criteria to the corresponding clinical information automatically extracted from text in the EHR.
METHODS Three open breast cancer clinical trials were selected by oncologists. Their eligibility criteria were manually abstracted from trial descriptions using the OHDSI ATLAS web application. Patients enrolled or screened for these trials were selected as 'positive' or 'possible' cases. Other patients diagnosed with breast cancer were selected as 'negative' cases. A selection of the clinical data and all clinical notes of these 229 selected patients was extracted from the MUSC clinical data warehouse and stored in a database implementing the OMOP common data model. Eligibility criteria were extracted from clinical notes using either manually crafted pattern matching (regular expressions) or a new natural language processing (NLP) application. These extracted criteria were then compared with reference criteria from trial descriptions. This comparison was realized with three different versions of a new application: rule-based, cosine similarity-based, and machine learning-based.
RESULTS For eligibility criteria extraction from clinical notes, the machine learning-based NLP application allowed for the highest accuracy with a micro-averaged recall of 90.9% and precision of 89.7%. For trial eligibility determination, the highest accuracy was reached by the machine learning-based approach with a per-trial AUC between 75.5% and 89.8%.
CONCLUSION NLP can be used to extract eligibility criteria from EHR clinical notes and automatically discover patients possibly eligible for a clinical trial with good accuracy, which could be leveraged to reduce the workload of humans screening patients for trials.
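
The abstract names three techniques without implementation detail: regular-expression extraction, cosine-similarity comparison of extracted and reference criteria, and micro-averaged precision and recall. The following is a minimal illustrative sketch of those ideas, not the authors' application. The pattern, the function names (`extract_er_status`, `cosine_similarity`, `micro_precision_recall`), the bag-of-words vectors, and the example counts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical pattern: estrogen receptor status as it might appear in a note.
ER_PATTERN = re.compile(
    r"\b(?:ER|estrogen receptor)\s*[:\-]?\s*(positive|negative)\b",
    re.IGNORECASE,
)


def extract_er_status(note):
    """Return 'positive' or 'negative' if the note states ER status, else None."""
    match = ER_PATTERN.search(note)
    return match.group(1).lower() if match else None


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two strings."""
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def micro_precision_recall(counts):
    """Micro-average: pool (tp, fp, fn) over all criterion types, then divide."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    print(extract_er_status("Pathology: ER positive, PR negative."))
    print(cosine_similarity("HER2 negative breast cancer", "breast cancer, HER2 negative"))
    # Made-up counts for two criterion types, for illustration only.
    print(micro_precision_recall([(8, 2, 0), (2, 0, 2)]))
```

Micro-averaging pools the counts before dividing, so frequent criterion types weigh more than rare ones; a macro-average would instead average the per-type scores.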

Updated: 2019-11-01