Dual Supervision Framework for Relation Extraction with Distant Supervision and Human Annotation, arXiv - CS - Computation and Language



arXiv - CS - Computation and Language, Pub Date: 2020-11-24, DOI: arxiv-2011.11851
Woohwan Jung, Kyuseok Shim

Relation extraction (RE) has been extensively studied because of its importance in real-world applications such as knowledge base construction and question answering. Most existing work trains models on either distantly supervised data or human-annotated data. To take advantage of both the high accuracy of human annotation and the low cost of distant supervision, we propose a dual supervision framework that effectively uses both types of data. However, simply combining the two types of data to train an RE model may decrease prediction accuracy, since distant supervision has labeling bias. To prevent incorrect distant-supervision labels from degrading accuracy, we employ two separate prediction networks, HA-Net and DS-Net, to predict the labels given by human annotation and distant supervision, respectively. Furthermore, we propose an additional loss term, called the disagreement penalty, that enables HA-Net to learn from distantly supervised labels. In addition, we use additional networks to adaptively assess the labeling bias by considering contextual information. Our performance study on sentence-level and document-level RE confirms the effectiveness of the dual supervision framework.
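The two-head idea in the abstract can be illustrated with a small sketch. The abstract does not give the form of the disagreement penalty, the network architectures, or the adaptive bias assessment, so everything concrete here is an assumption made for illustration: a single binary relation, one logit per head, binary cross-entropy as the supervised loss, a squared difference between the two heads' probabilities as a stand-in for the penalty, and the hypothetical names `dual_supervision_loss` and `penalty_weight`.

```python
import math


def sigmoid(x):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one probability p and one 0/1 label y."""
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def dual_supervision_loss(ha_logit, ds_logit, label, source, penalty_weight=0.1):
    """Loss for one example under a two-head setup (illustrative only).

    ha_logit / ds_logit: outputs of the HA-Net and DS-Net heads.
    source: "HA" for a human-annotated label, "DS" for a distant one.
    The squared-difference penalty is an assumed stand-in, not the
    paper's definition of the disagreement penalty.
    """
    p_ha = sigmoid(ha_logit)
    p_ds = sigmoid(ds_logit)
    if source == "HA":
        supervised = bce(p_ha, label)   # human label supervises HA-Net
    elif source == "DS":
        supervised = bce(p_ds, label)   # distant label supervises DS-Net
    else:
        raise ValueError("source must be 'HA' or 'DS'")
    # Penalize the two heads for drifting apart, so a distant label
    # still influences HA-Net indirectly through DS-Net.
    disagreement = (p_ha - p_ds) ** 2
    return supervised + penalty_weight * disagreement
```

In this sketch a distantly supervised label never enters HA-Net's cross-entropy term directly; it reaches HA-Net only through the penalty, which is one simple way to keep noisy labels from dominating the head used for final predictions.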


Updated: 2020-11-25
