Integrating Speculation Detection and Deep Learning to Extract Lung Cancer Diagnosis from Clinical Notes

Applied Sciences (IF 2.838), Pub Date: 2021-01-19, DOI: 10.3390/app11020865
Oswaldo Solarte Pabón, Maria Torrente, Mariano Provencio, Alejandro Rodríguez-Gonzalez, Ernestina Menasalvas

Despite efforts to develop models for extracting medical concepts from clinical notes, challenges remain, in particular relating concepts to dates. The large number of clinical notes written for each patient, the use of negation and speculation, and the variety of date formats create ambiguity that must be resolved to reconstruct the patient’s natural history. In this paper, we concentrate on extracting the cancer diagnosis from clinical narratives and relating it to the diagnosis date. To address this challenge, we propose a hybrid approach that combines deep learning-based and rule-based methods. The approach integrates three steps: (i) lung cancer named entity recognition, (ii) negation and speculation detection, and (iii) relating the cancer diagnosis to a valid date. In particular, we apply the proposed approach to extract the lung cancer diagnosis and its diagnosis date from clinical narratives written in Spanish. The results show an F-score of 90% in the named entity recognition task and an F-score of 89% in the task of relating the cancer diagnosis to the diagnosis date. Our findings suggest that speculation detection, together with negation detection, is a key component for properly extracting cancer diagnoses from clinical notes.
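
To make the three steps concrete, the following is a minimal, purely rule-based sketch in Python. It is not the authors' system: the abstract gives no implementation details, so the term pattern, the cue lists, the four-token window, the dd/mm/yyyy date format, and all names (`extract_diagnoses`, `Diagnosis`, and so on) are assumptions made for illustration. In particular, a regular expression stands in for the deep learning-based named entity recognizer.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical resources: the abstract does not list the actual terms or cues.
CANCER_TERMS = re.compile(
    r"(?:adenocarcinoma|carcinoma|c[aá]ncer) de pulm[oó]n", re.IGNORECASE
)
NEGATION_CUES = {"no", "sin", "descarta", "niega"}
SPECULATION_CUES = {"posible", "probable", "sospecha", "compatible"}
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")  # assumed dd/mm/yyyy
WINDOW = 4  # assumed number of tokens inspected before a mention


@dataclass
class Diagnosis:
    mention: str
    status: str  # "affirmed", "negated", or "speculated"
    diagnosis_date: Optional[str]  # ISO 8601 date, or None


def _status(sentence: str, start: int) -> str:
    """Step (ii): look for a cue in the few tokens preceding the mention."""
    tokens = re.findall(r"\w+", sentence[:start].lower())[-WINDOW:]
    if NEGATION_CUES.intersection(tokens):
        return "negated"
    if SPECULATION_CUES.intersection(tokens):
        return "speculated"
    return "affirmed"


def _nearest_valid_date(sentence: str, position: int) -> Optional[str]:
    """Step (iii): pick the closest date that is a real calendar date."""
    best = None
    for match in DATE_PATTERN.finditer(sentence):
        day, month, year = (int(group) for group in match.groups())
        try:
            parsed = date(year, month, day)
        except ValueError:
            continue  # e.g. 31/02/2015 is not a valid date
        distance = abs(match.start() - position)
        if best is None or distance < best[0]:
            best = (distance, parsed.isoformat())
    return best[1] if best else None


def extract_diagnoses(sentence: str) -> List[Diagnosis]:
    """Run the three steps on one sentence."""
    results = []
    for match in CANCER_TERMS.finditer(sentence):  # step (i), regex stand-in
        status = _status(sentence, match.start())
        # Assumption: only affirmed mentions are linked to a date.
        linked = (
            _nearest_valid_date(sentence, match.start())
            if status == "affirmed"
            else None
        )
        results.append(Diagnosis(match.group(0), status, linked))
    return results
```

Under these assumptions, `extract_diagnoses("Diagnosticado de adenocarcinoma de pulmón el 12/03/2015.")` yields one affirmed mention linked to `2015-03-12`, while a sentence beginning "Sospecha de cáncer de pulmón" is marked as speculated and receives no date. This illustrates why a pipeline that skipped the speculation check would record a diagnosis that the note never confirms.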

Updated: 2021-01-19