Cross-lingual Unified Medical Language System entity linking in online health communities.
Journal of the American Medical Informatics Association (IF 6.4), Pub Date: 2020-09-10, DOI: 10.1093/jamia/ocaa150
Yonatan Bitton 1, Raphael Cohen 1, Tamar Schifter 2, Eitan Bachmat 1, Michael Elhadad 1, Noémie Elhadad 3

Abstract
Objective
In Hebrew online health communities, participants commonly write medical terms as transliterations of English source terms. Such transliterations introduce high variability in text and challenge text-analytics methods. To reduce this variability, medical terms must be normalized, for example by linking them to Unified Medical Language System (UMLS) concepts. We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities.
Materials and Methods
We investigate the effect of linking terms in Camoni, a popular Israeli online health community in Hebrew. Our method, MDTEL (Medical Deep Transliteration Entity Linking), includes (1) an attention-based recurrent neural network encoder-decoder that transliterates words and maps UMLS terms from English to Hebrew, (2) an unsupervised method for creating a transliteration dataset in any language without manually labeled data, and (3) an efficient way to identify medical entities in the Hebrew corpus and link them to UMLS concepts, by first producing a high-recall list of candidate medical terms in the corpus and then filtering the candidates down to the relevant medical terms.
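The two-stage pattern in step (3), a high-recall candidate list followed by a filter, can be illustrated with a minimal sketch. This is not the MDTEL implementation: the toy lexicon, the concept identifiers, the Latin-script tokens standing in for Hebrew text, the use of character-level similarity from Python's `difflib`, and both thresholds are assumptions made only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy lexicon: surface form -> placeholder concept identifier.
# A real system would hold Hebrew forms mapped to UMLS concepts instead.
LEXICON = {
    "insulin": "CONCEPT_INSULIN",
    "metformin": "CONCEPT_METFORMIN",
}


def similarity(a, b):
    """Character-level similarity in [0, 1] (a stand-in for a learned matcher)."""
    return SequenceMatcher(None, a, b).ratio()


def generate_candidates(tokens, lexicon, loose=0.6):
    """High-recall pass: keep every (token, term) pair above a loose threshold."""
    candidates = []
    for token in tokens:
        for term, concept in lexicon.items():
            score = similarity(token, term)
            if score >= loose:
                candidates.append((token, term, concept, score))
    return candidates


def filter_candidates(candidates, strict=0.85):
    """Precision pass: keep the best-scoring concept per token above a strict threshold."""
    best = {}
    for token, _term, concept, score in candidates:
        if score >= strict and (token not in best or score > best[token][1]):
            best[token] = (concept, score)
    return {token: concept for token, (concept, _score) in best.items()}


tokens = ["insulyn", "metformine", "insult", "today"]
candidates = generate_candidates(tokens, LEXICON)
linked = filter_candidates(candidates)
```

Here the loose pass admits the spelling variants and also the near miss "insult", and the strict pass then discards "insult" while keeping the two variants. Separating the passes lets each threshold be tuned for its own goal.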
Results
We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression. Tagging and normalizing Camoni posts with MDTEL achieved 99% accuracy, 92% recall, and 87% precision. When terms in queries from the Camoni search logs were tagged and normalized, the UMLS-normalized queries improved search results in 46% of the cases.
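For readers less familiar with the three reported measures, the following sketch shows how accuracy, precision, and recall are computed from confusion counts. The counts are made up for illustration and are not the paper's data. The function name is likewise hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all decisions (positive and negative) that were correct.
        "accuracy": (tp + tn) / total,
        # Share of predicted links that were correct.
        "precision": tp / (tp + fp),
        # Share of true links that were found.
        "recall": tp / (tp + fn),
    }


# Illustrative counts only: 8 correct links, 2 spurious, 2 missed, 88 correct rejections.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Because true negatives count toward accuracy but not toward precision or recall, accuracy can sit well above the other two measures when most items are not medical terms.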
Conclusions
Cross-lingual UMLS entity linking from Hebrew is possible and improves search performance across communities. Annotated datasets, annotation guidelines, and code are made available online (https://github.com/yonatanbitton/mdtel).


Updated: 2020-10-16