Nested relation extraction with iterative neural network
Frontiers of Computer Science (IF 4.2), Pub Date: 2021-01-16, DOI: 10.1007/s11704-020-9420-6
Yixuan Cao, Dian Chen, Zhengqi Xu, Hongwei Li, Ping Luo

Most existing research on relation extraction focuses on binary flat relations, such as the BornIn relation between a Person and a Location. However, a large portion of the objective facts described in natural language are complex, especially in professional documents in fields such as finance and biomedicine that require precise expression. For example, "the GDP of the United States in 2018 grew 2.9% compared with 2017" describes a growth-rate relation between two other relations about an economic index, which is beyond the expressive power of binary flat relations. Thus, we propose the nested relation extraction problem and formulate it as a directed acyclic graph (DAG) structure extraction problem. We then propose a solution based on an iterative neural network that extracts relations layer by layer. The proposed solution achieves F1 scores of 78.98 and 97.89 on two nested relation extraction tasks, namely semantic cause-and-effect relation extraction and formula extraction. Furthermore, we observe that nested relations are usually expressed in long sentences where entities are mentioned repeatedly, which makes annotation difficult and error-prone. Hence, we extend our model with a mention-insensitive mode that only requires annotations of relations on entity concepts (instead of exact mentions) while preserving most of its performance. Our mention-insensitive model performs better than the mention-sensitive model when the random level in mention selection is higher than 0.3.
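As a minimal illustration (not the authors' implementation), the nested structure of the GDP example from the abstract can be sketched as a small DAG in Python, where leaf nodes are entity mentions and internal nodes are relations whose arguments may themselves be relations. The Entity/Relation classes, the EconomicIndex and GrowthRate labels, and the depth helper are hypothetical names chosen only for this sketch.

from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Entity:
    text: str  # an entity mention, e.g. "GDP" or "United States"


@dataclass
class Relation:
    label: str  # relation type, e.g. "EconomicIndex" or "GrowthRate"
    args: List[Union["Entity", "Relation"]] = field(default_factory=list)


# "the GDP of the United States in 2018 grew 2.9% compared with 2017"
gdp = Entity("GDP")
us = Entity("United States")

# First extraction layer: two flat relations over entity mentions.
gdp_2018 = Relation("EconomicIndex", [gdp, us, Entity("2018")])
gdp_2017 = Relation("EconomicIndex", [gdp, us, Entity("2017")])

# Second extraction layer: a relation whose arguments are themselves relations.
# Because gdp and us are shared by both lower relations, the whole structure
# is a directed acyclic graph rather than a tree or a flat set of triples.
growth = Relation("GrowthRate", [gdp_2018, gdp_2017, Entity("2.9%")])


def depth(node: Union[Entity, Relation]) -> int:
    """Number of layers needed to build this node; an iterative extractor
    would produce depth-1 relations first, then depth-2 relations, and so on."""
    if isinstance(node, Entity):
        return 0
    return 1 + max(depth(a) for a in node.args)


print(depth(growth))  # 2: GrowthRate is produced in the second iteration

Under this reading, extracting relations "layer by layer" corresponds to building all depth-1 nodes of the DAG before attempting the depth-2 node that takes them as arguments.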



Updated: 2021-01-18