The 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task on clinical concept normalization for clinical records.
Journal of the American Medical Informatics Association (IF 6.4), Pub Date: 2020-09-24, DOI: 10.1093/jamia/ocaa106
Sam Henry 1 , Yanshan Wang 2 , Feichen Shen 2 , Ozlem Uzuner 1, 3, 4

Abstract
Objective
Track 3 of the 2019 National Natural language processing (NLP) Clinical Challenges (n2c2)/Open Health NLP (OHNLP) shared task focused on medical concept normalization (MCN) in clinical records. This track aimed to assess the state of the art in identifying and matching salient medical concepts to a controlled vocabulary. In this paper, we describe the task and the data set used, compare the participating systems, present results, identify the strengths and limitations of the current state of the art, and identify directions for future research.
Materials and Methods
Participating teams were provided with narrative discharge summaries in which text spans corresponding to medical concepts were identified. This paper refers to these text spans as mentions. Teams were tasked with normalizing these mentions to concepts, represented by concept unique identifiers, within the Unified Medical Language System. Submitted systems represented 4 broad categories of approaches: cascading dictionary matching, cosine distance, deep learning, and retrieve-and-rank systems. Disambiguation modules were common across all approaches.
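As a rough illustration of the first category named above, cascading dictionary matching can be sketched as a sequence of increasingly permissive lookups, where the first stage that finds a match wins. This is only a minimal sketch and not any participating team's system: the dictionary entries, the identifiers (CUI_A, CUI_B), the abbreviation table, and all function names are made-up placeholders, and a real system would load terms and concept unique identifiers from the Unified Medical Language System instead.

```python
import re

# Toy dictionary: these terms and identifiers are made-up placeholders,
# not real UMLS entries.
TOY_DICTIONARY = {
    "hypertension": "CUI_A",
    "myocardial infarction": "CUI_B",
    "heart attack": "CUI_B",
}

# Hypothetical abbreviation table used by the last, loosest stage.
TOY_ABBREVIATIONS = {"htn": "hypertension", "mi": "myocardial infarction"}


def exact_match(mention):
    """Stage 1: look the mention up exactly as written."""
    return TOY_DICTIONARY.get(mention)


def normalized_match(mention):
    """Stage 2: lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", mention.lower())
    cleaned = " ".join(cleaned.split())
    return TOY_DICTIONARY.get(cleaned)


def abbreviation_match(mention):
    """Stage 3: expand a known abbreviation, then look up the expansion."""
    key = mention.lower().strip(" .")
    expanded = TOY_ABBREVIATIONS.get(key)
    return TOY_DICTIONARY.get(expanded) if expanded else None


# Stages run from strictest to loosest; the first hit is returned.
STAGES = (exact_match, normalized_match, abbreviation_match)


def normalize(mention):
    """Return a concept identifier for the mention, or None if no stage matches."""
    for stage in STAGES:
        concept_id = stage(mention)
        if concept_id is not None:
            return concept_id
    return None


if __name__ == "__main__":
    for text in ["hypertension", "Heart-Attack", "HTN", "fever"]:
        print(text, "->", normalize(text))
```

Ordering the stages from strict to loose means a precise match is never overridden by a fuzzier one. The disambiguation modules mentioned above would sit after a stage that returns several candidates, which this sketch does not model.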
Results
A total of 33 teams participated in the MCN task. The best-performing team achieved an accuracy of 0.8526. The median and mean performances among all teams were 0.7733 and 0.7426, respectively.
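For readers unfamiliar with the metric, accuracy in this setting can be read as the fraction of mentions whose predicted concept identifier matches the gold-standard one. The sketch below assumes exact-match scoring with one gold identifier per mention; the function name and the toy labels are hypothetical, and the official evaluation script may differ in its details.

```python
def accuracy(gold, predicted):
    """Fraction of mentions whose predicted identifier equals the gold identifier."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


if __name__ == "__main__":
    # Made-up labels for four mentions; three of the four predictions agree.
    gold_ids = ["CUI_A", "CUI_B", "CUI_C", "CUI_D"]
    predicted_ids = ["CUI_A", "CUI_B", "CUI_X", "CUI_D"]
    print(accuracy(gold_ids, predicted_ids))
```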
Conclusions
Overall performance among the top 10 teams was high. However, several mention types were challenging for all teams. These included mentions requiring disambiguation of misspelled words, acronyms, abbreviations, and mentions with more than 1 possible semantic type. Also challenging were complex mentions of long, multi-word terms that may require new ways of extracting and representing mention meaning, the use of domain knowledge, parse trees, or hand-crafted rules.


Updated: 2020-10-16