Disease phenotype synonymous prediction through network representation learning from PubMed database


Artificial Intelligence in Medicine (IF 6.1), Pub Date: 2019-11-19, DOI: 10.1016/j.artmed.2019.101745
Shiwen Ma₁, Kuo Yang₁, Ning Wang₁, Qiang Zhu₁, Zhuye Gao₂, Runshun Zhang₃, Baoyan Liu₄, Xuezhong Zhou₁


Mapping synonyms between phenotype concepts from different terminologies is difficult because terminology databases have been developed largely independently. Existing maps of synonymous phenotype concepts across terminology databases are highly incomplete, and manual mapping is time-consuming and laborious. An automatic method for predicting synonymous phenotype mappings is therefore of particular importance. We propose a classifier-based phenotype mapping prediction model (CPM) to predict synonymous relationships between phenotype concepts from different terminology databases. The model takes network semantic representations of phenotypes as input and predicts synonymous relationships by training binary classifiers and combining them with a voting strategy. We compared the performance of the CPM with that of a similarity-based phenotype mapping prediction model (SPM), which predicts mappings from the ranked cosine similarity of candidate mapping concepts. Using the N2V-TFIDF network representation with the majority voting (MV) strategy, the CPM achieved an accuracy of 0.943, which was 15.4% higher than that of the SPM using the cosine similarity method (0.789) and 23.8% higher than that of the SSDTM method (0.724) proposed in our previous work.
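The contrast between the two prediction styles described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the three-dimensional vectors, the candidate names, and the threshold-based stand-in classifiers are all hypothetical, and the abstract does not say how concept pairs are turned into classifier features. The sketch only shows the general idea of ranking candidates by cosine similarity (SPM-style) versus taking a majority vote over several binary classifiers (CPM-style).

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
PairClassifier = Callable[[Vector, Vector], bool]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def spm_rank(query: Vector, candidates: Dict[str, Vector]) -> List[str]:
    """Similarity-based sketch: candidate names by descending cosine similarity."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(query, candidates[name]),
        reverse=True,
    )


def cpm_predict(a: Vector, b: Vector, classifiers: Sequence[PairClassifier]) -> bool:
    """Classifier-based sketch: strict majority vote over binary pair classifiers."""
    positive_votes = sum(1 for classify in classifiers if classify(a, b))
    return positive_votes * 2 > len(classifiers)


def threshold_classifier(threshold: float) -> PairClassifier:
    """Hypothetical stand-in for a trained binary classifier (assumption, not from the paper)."""
    return lambda a, b: cosine_similarity(a, b) > threshold


# Toy embeddings; real inputs would be learned network representations.
query = [1.0, 0.0, 1.0]
candidates = {
    "candidate_a": [1.0, 0.1, 0.9],
    "candidate_b": [0.0, 1.0, 0.0],
    "candidate_c": [1.0, 1.0, 1.0],
}
classifiers = [threshold_classifier(t) for t in (0.5, 0.8, 0.9)]

ranking = spm_rank(query, candidates)
votes = {name: cpm_predict(query, vec, classifiers) for name, vec in candidates.items()}
print(ranking)
print(votes)
```

In this toy setup the ranking approach always returns an ordering, even when no candidate is a true synonym, whereas the voting approach gives an explicit yes or no decision per pair. That difference is one plausible reason to frame mapping as classification, though the abstract itself does not state the motivation.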


Updated: 2019-11-19
