Slot Filling for Biomedical Information Extraction
arXiv - CS - Computation and Language. Pub Date: 2021-09-17. DOI: arxiv-2109.08564. Authors: Yannis Papanikolaou, Francine Bennett
Information Extraction (IE) from text refers to the task of extracting
structured knowledge from unstructured text. The task typically consists of a
series of sub-tasks such as Named Entity Recognition and Relation Extraction.
Sourcing entity and relation type specific training data is a major bottleneck
in the above sub-tasks. In this work we present a slot filling approach to the
task of biomedical IE, effectively replacing the need for entity and
relation-specific training data and making it possible to handle zero-shot
settings. We follow the recently proposed paradigm of coupling a Transformer-based
bi-encoder, Dense Passage Retrieval, with a Transformer-based reader model to
extract relations from biomedical text. We assemble a biomedical slot filling
dataset for both retrieval and reading comprehension and conduct a series of
experiments demonstrating that our approach outperforms a number of simpler
baselines. We also evaluate our approach end-to-end for standard as well as
zero-shot settings. Our work provides a fresh perspective on how to solve
biomedical IE tasks, in the absence of relevant training data. Our code, models
and pretrained data are available at
https://github.com/healx/biomed-slot-filling.
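
As a rough illustration of the retrieve-then-read idea described in the abstract, the sketch below casts slot filling as a query built from a head entity and a relation, ranks passages by an inner product between query and passage vectors (the scoring scheme a bi-encoder uses), and then asks a reader for the tail entity. Everything here is an assumption made for illustration: the bag-of-words `encode` function stands in for a trained Transformer encoder, the lexicon-matching `read` function stands in for a Transformer reader, and the names `fill_slot`, `PASSAGES` and `CANDIDATES` are hypothetical. It is not the authors' implementation, which is in the repository linked above.

```python
import math
from collections import Counter

# Illustrative toy corpus and entity lexicon (assumptions, not from the paper).
PASSAGES = [
    "Metformin treats type 2 diabetes in adults.",
    "Aspirin inhibits cyclooxygenase enzymes.",
    "Insulin is produced in the pancreas.",
]
CANDIDATES = ["aspirin", "cyclooxygenase", "metformin", "type 2 diabetes"]


def encode(text):
    """Toy stand-in for a learned encoder: L2-normalised bag-of-words vector."""
    tokens = text.lower().replace(".", " ").replace(",", " ").split()
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}


def score(query_vec, passage_vec):
    """Inner product of query and passage vectors, as a bi-encoder would compute."""
    return sum(w * passage_vec.get(tok, 0.0) for tok, w in query_vec.items())


def retrieve(query, passages, k=1):
    """Return the k passages with the highest inner-product score."""
    q = encode(query)
    ranked = sorted(passages, key=lambda p: score(q, encode(p)), reverse=True)
    return ranked[:k]


def read(head, passage, candidates):
    """Toy reader: earliest lexicon entry found in the passage, excluding the head."""
    lowered = passage.lower()
    found = [
        (lowered.find(c.lower()), c)
        for c in candidates
        if c.lower() != head.lower() and c.lower() in lowered
    ]
    return min(found)[1] if found else None


def fill_slot(head, relation, passages, candidates):
    """Fill the tail slot of (head, relation, ?) via retrieval followed by reading."""
    best = retrieve(f"{head} {relation}", passages, k=1)[0]
    return read(head, best, candidates)


print(fill_slot("aspirin", "inhibits", PASSAGES, CANDIDATES))  # prints "cyclooxygenase"
```

Because the query is just a head entity plus a relation phrase, a new relation type only changes the query text, which is the intuition behind handling zero-shot settings without relation-specific training data. A real system would replace both toy components with trained models.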
Updated: 2021-09-20