NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction
arXiv - CS - Computation and Language. Pub Date: 2019-09-05, DOI: arxiv-1909.02177
Wenxuan Zhou, Hongtao Lin, Bill Yuchen Lin, Ziqi Wang, Junyi Du, Leonardo Neves, Xiang Ren

Deep neural models for relation extraction tend to be less reliable when perfectly labeled data is limited, despite their success in label-sufficient scenarios. Instead of seeking more instance-level labels from human annotators, we propose to annotate frequent surface patterns to form labeling rules. These rules can be automatically mined from large text corpora and generalized via a soft rule matching mechanism. Prior work uses labeling rules in an exact-matching fashion, which inherently limits the coverage of sentence matching and results in a low-recall issue. In this paper, we present NERO, a neural approach to grounding rules for RE, which jointly learns a relation extraction module and a soft matching module. Any neural relation extraction model can serve as the instantiation of the RE module. The soft matching module learns to match rules with semantically similar sentences so that raw corpora can be automatically labeled and leveraged by the RE module (with much better coverage) as augmented supervision, in addition to the exactly matched sentences. Extensive experiments and analysis on two public and widely used datasets demonstrate the effectiveness of the proposed NERO framework compared with both rule-based and semi-supervised methods. Through user studies, we find that the time cost for a human to annotate a rule and a sentence is similar (0.30 vs. 0.35 min per label). In particular, NERO's performance using 270 rules is comparable to that of models trained on 3,000 labeled sentences, yielding a 9.5x speedup. Moreover, NERO can predict unseen relations at test time and provide interpretable predictions. We release our code to the community for future research.
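The core idea, soft rule grounding, can be illustrated with a minimal sketch: instead of firing a labeling rule only on sentences that literally contain its surface pattern, a matcher scores the semantic similarity between rule patterns and sentences and pseudo-labels sufficiently similar sentences as augmented supervision for the RE module. The encoder, rule format, and threshold below are illustrative assumptions for this sketch, not the authors' released implementation.

# Toy sketch of soft rule matching in the spirit of NERO (illustration only).
# The hashing encoder, rule format, and threshold are assumptions, not the paper's code.
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class LabelingRule:
    pattern: str   # e.g. "SUBJ-PER , the founder of OBJ-ORG"
    relation: str  # e.g. "org:founded_by"

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder sentence encoder (hashing trick); a real system would use a
    trained neural encoder learned jointly with the relation extraction module."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def soft_match(sentence: str, rules: List[LabelingRule],
               threshold: float = 0.7) -> Tuple[str, float]:
    """Return the relation of the best-matching rule if its similarity clears the
    threshold; exact matching would only fire on literal occurrences of the pattern."""
    sent_vec = embed(sentence)
    scores = [(r.relation, float(sent_vec @ embed(r.pattern))) for r in rules]
    relation, score = max(scores, key=lambda s: s[1])
    return (relation, score) if score >= threshold else ("no_relation", score)

rules = [LabelingRule("SUBJ-PER , the founder of OBJ-ORG", "org:founded_by")]
print(soft_match("SUBJ-PER , who started OBJ-ORG in 1998", rules))

In the actual framework the matcher is trained jointly with the RE module rather than fixed as above; the sketch only shows how soft matching expands coverage beyond exact pattern hits.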

Updated: 2020-01-17