Learning Informative Representations of Biomedical Relations with Latent Variable Models
arXiv - CS - Computation and Language, Pub Date: 2020-11-20, DOI: arxiv-2011.10285
Harshil Shah, Julien Fauqueur

Extracting biomedical relations from large corpora of scientific documents is a challenging natural language processing task. Existing approaches usually focus on identifying a relation either in a single sentence (mention-level) or across an entire corpus (pair-level). In both cases, recent methods have achieved strong results by learning a point estimate to represent the relation; this is then used as the input to a relation classifier. However, the relation expressed in text between a pair of biomedical entities is often more complex than can be captured by a point estimate. To address this issue, we propose a latent variable model with an arbitrarily flexible distribution to represent the relation between an entity pair. Additionally, our model provides a unified architecture for both mention-level and pair-level relation extraction. We demonstrate that our model achieves results competitive with strong baselines for both tasks while having fewer parameters and being significantly faster to train. We make our code publicly available.

Updated: 2020-11-23