Combining entity co-occurrence with specialized word embeddings to measure entity relation in Alzheimer's disease.
BMC Medical Informatics and Decision Making (IF 3.3), Pub Date: 2019-12-05, DOI: 10.1186/s12911-019-0934-5
Go Eun Heo 1, Qing Xie 1, Min Song 1, Jeong-Hoon Lee 2

BACKGROUND Extracting useful information from biomedical literature plays an important role in the development of modern medicine. In natural language processing, there have been rigorous attempts to find meaningful relationships between entities automatically using co-occurrence-based methods. It has become increasingly important to understand whether a relationship exists between any two entities extracted from a large number of texts and, if so, how strong it is. One central approach is to measure the semantic similarity and relatedness of the two entities.

METHODS We propose a hybrid ranking method for measuring the relatedness of two entities. It combines a co-occurrence approach, which considers both direct and indirect entity-pair relationships, with specialized word embeddings.

RESULTS We compare the proposed ranking method with other well-known methods, namely co-occurrence, Word2Vec, COALS (Correlated Occurrence Analog to Lexical Semantics), and random indexing, by calculating the top-ranked entities related to Alzheimer's disease. In addition, we analyze gene, pathway, and gene-phenotype relationships. Overall, the proposed method tends to find more hidden relationships than the other methods.

CONCLUSION Our proposed method is able to select more useful related entities: ones that not only co-occur frequently with the target entity but also have more indirect relations to it. In pathway analysis, our proposed method shows superior performance at identifying (functional) cross clustering and higher-level pathways. In phenotype analysis, our proposed method has an advantage in identifying common genotypes related to phenotypes in the biological literature.
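To make the idea in METHODS concrete, here is a minimal sketch of a hybrid ranking that mixes direct co-occurrence, indirect co-occurrence through a bridging entity, and embedding cosine similarity. The abstract does not give the paper's scoring formula, so everything here is an assumption made for illustration: the function names, the `min`-based indirect score, the weights `alpha` and `beta`, and the toy documents and vectors. None of it is the authors' implementation or data.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how often each unordered entity pair shares a document."""
    counts = Counter()
    for entities in docs:
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


def direct_score(counts, a, b):
    """Direct relation: number of documents mentioning both a and b."""
    return counts[tuple(sorted((a, b)))]


def indirect_score(counts, vocab, a, b):
    """Indirect relation (assumed form): for each bridging entity c,
    take the weaker of the a-c and c-b links, then sum over all c."""
    return sum(
        min(direct_score(counts, a, c), direct_score(counts, c, b))
        for c in vocab
        if c not in (a, b)
    )


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_rank(target, docs, embeddings, alpha=0.5, beta=0.5):
    """Rank all other entities by a weighted mix of co-occurrence and
    embedding similarity. alpha and beta are hypothetical weights."""
    counts = cooccurrence_counts(docs)
    vocab = {e for d in docs for e in d}
    candidates = sorted(vocab - {target})
    raw = {
        c: direct_score(counts, target, c)
        + beta * indirect_score(counts, vocab, target, c)
        for c in candidates
    }
    top = max(raw.values(), default=0) or 1  # avoid division by zero
    scores = {
        c: alpha * raw[c] / top
        + (1 - alpha) * cosine(embeddings[target], embeddings[c])
        for c in candidates
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy input, invented for illustration only: each document is reduced
# to the list of entities it mentions.
docs = [["AD", "APOE"], ["AD", "APOE"], ["APOE", "TREM2"], ["AD", "APP"]]
embeddings = {
    "AD": [1.0, 0.0],
    "APOE": [0.9, 0.1],
    "TREM2": [0.8, 0.2],
    "APP": [0.0, 1.0],
}
ranking = hybrid_rank("AD", docs, embeddings)
for entity, score in ranking:
    print(f"{entity}\t{score:.3f}")
```

In this toy input, `TREM2` never shares a document with `AD`, so a purely direct co-occurrence ranking would score it zero. The indirect term (through `APOE`) and the embedding term still lift it above `APP`, which is the kind of hidden relationship the abstract describes.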

Updated: 2019-12-05