Joint Extraction of Entities and Relations via an Entity Correlated Attention Neural Model
Information Sciences ( IF 8.1 ) Pub Date : 2021-09-15 , DOI: 10.1016/j.ins.2021.09.028
Ren Li 1 , Dong Li 1 , Jianxi Yang 1 , Fangyue Xiang 1 , Hao Ren 1 , Shixin Jiang 1 , Luyi Zhang 1

Named entity recognition and relation extraction are crucial tasks in natural language processing. Because traditional pipelined approaches can suffer from error propagation and ignore the underlying interactions between the two tasks, joint extraction of entities and relations has become the dominant trend. However, the performance of existing joint extraction models still leaves room for improvement. This paper presents a two-stage tagging scheme that separately labels candidate head entities and, for each specific relation, multiple tail entities. It then proposes a novel lightweight joint extraction neural model based on this entity-first labeling strategy. In the proposed model, a BiLSTM-based encoder combines hidden-state and global context features and feeds them as input to the two subsequent entity-labeling tasks. Given this mixed context representation, the candidate-head-entity recognition module identifies candidate head entities, while the multiple-tail-entities recognition module uses an entity-correlated attention mechanism to identify the tail entities that correspond to a specific head entity. Comprehensive experiments were performed on two widely used English datasets and one self-constructed Chinese dataset. The experimental results showed that the proposed model outperformed the baseline approaches on the relation extraction task and achieved competitive entity recognition performance with a lightweight architecture.
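The two-stage tagging scheme can be illustrated with a small decoding sketch. The abstract does not specify the tag format, so everything here is an assumption made for illustration: plain B/I/O tags, the layout of `tail_tags` (head span, then relation, then tag sequence), the relation name `born_in`, and the function names `decode_spans` and `decode_triples`. It shows only how head-entity tags and per-relation tail-entity tags could be combined into triples, not the paper's model.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def decode_spans(tags: List[str]) -> List[Span]:
    """Collect entity spans from a B/I/O tag sequence (assumed tag set)."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:      # close the previous entity
                spans.append((start, i))
            start = i
        elif tag == "I":
            if start is None:          # tolerate an I with no preceding B
                start = i
        else:                          # "O" ends any open entity
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:              # entity running to the end
        spans.append((start, len(tags)))
    return spans


def decode_triples(
    tokens: List[str],
    head_tags: List[str],
    tail_tags: Dict[Span, Dict[str, List[str]]],
) -> List[Tuple[str, str, str]]:
    """Stage 1: find head entities. Stage 2: for each head and each
    relation, read the tail entities from that relation's tag sequence."""
    triples = []
    for head in decode_spans(head_tags):
        head_text = " ".join(tokens[head[0]:head[1]])
        for relation, tags in tail_tags.get(head, {}).items():
            for tail in decode_spans(tags):
                tail_text = " ".join(tokens[tail[0]:tail[1]])
                triples.append((head_text, relation, tail_text))
    return triples


# Hypothetical example sentence and tags, written by hand.
tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw", ",", "Poland"]
head_tags = ["B", "I", "O", "O", "O", "O", "O", "O"]
tail_tags = {
    (0, 2): {"born_in": ["O", "O", "O", "O", "O", "B", "O", "B"]},
}
triples = decode_triples(tokens, head_tags, tail_tags)
print(triples)
```

Because tail entities are tagged separately per head entity and per relation, one head entity can yield several triples, which is how a scheme of this kind can represent multiple tail entities under a single head.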




Updated: 2021-09-15