Extracting Biomedical Entity Relations using Biological Interaction Knowledge, Interdisciplinary Sciences: Computational Life Sciences


Interdisciplinary Sciences: Computational Life Sciences (IF 4.8), Pub Date: 2021-03-17, DOI: 10.1007/s12539-021-00425-8
Shuyu Guo¹,², Lan Huang¹,², Gang Yao³, Ye Wang¹,², Haotian Guan¹,², Tian Bai¹,²


Discovering relations between biomedical entities of different types is crucial for biological research. A large number of potential or indirectly connected biological relations are hidden in millions of biomedical publications and biological databases. Previous rule-based and deep learning approaches rely on large amounts of manual annotation, a process that is laborious, time-consuming and unsatisfactory. It is therefore necessary to combine available annotated gene databases with chemical, genomic, clinical and other types of data repositories as domain knowledge to assist the extraction of biological entity relations from the literature. To this end, this paper proposes BioGraphSAGE, a Siamese graph neural network that uses structured databases as domain knowledge to extract biological entity relations from the literature. The model combines biological semantic features with positional features to improve the recognition of relations between distant entities in the same document. Experimental results show that BioGraphSAGE achieves the best F1 score among the compared relation extraction models when trained on smaller annotated samples. Moreover, the proposed model still maintains an F1 score of 0.526 without using any annotated training samples.


Updated: 2021-03-17
