A new model for coreference resolution based on knowledge representation and multi-criteria ranking
Journal of Intelligent & Fuzzy Systems (IF 1.7), Pub Date: 2020-11-05, DOI: 10.3233/jifs-201050
Samira Hourali, Morteza Zahedi, Mansour Fateh

Coreference resolution is critical for improving the performance of text-based systems, including information extraction, document summarization, machine translation, and question answering. Most coreference resolution solutions rely on knowledge resources such as lexical, syntactic, world, and semantic knowledge. This paper presents a new knowledge-based coreference resolution model built on a neural network architecture. It takes XLNet embeddings as input and does not rely on any syntactic or dependency parsers. For more efficient span representation and mention detection, we use entity-level information. Mentions are extracted from the text by a mention detector that is not hand-engineered, and the features are extracted by a deep neural network. We also propose a nonlinear multi-criteria ranking model to rank the candidate antecedents. This model determines the total score of the alternatives and the weights of the features simultaneously, which speeds up the ranking of alternatives. Compared to state-of-the-art models, the simulation results show a significant improvement on the English CoNLL-2012 shared task (+6.4 F1). Moreover, we achieve a 96.1% F1 score on the n2c2 medical dataset.
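To make the idea of ranking candidate antecedents by several criteria concrete, here is a minimal sketch. It is not the paper's model: the abstract describes a nonlinear model that determines the feature weights and total scores jointly, whereas this example uses a plain weighted sum with fixed, hand-picked weights. The function name `rank_antecedents`, the three criteria, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rank_antecedents(
    candidates: Dict[str, List[float]],
    weights: List[float],
) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs sorted from best to worst.

    Simplified stand-in: a linear weighted sum with fixed weights,
    not the nonlinear, jointly weighted model the abstract describes.
    """
    ranked = []
    for name, criteria in candidates.items():
        if len(criteria) != len(weights):
            raise ValueError(f"{name!r} has {len(criteria)} criteria, expected {len(weights)}")
        # Total score of one alternative: weighted sum over its criteria.
        score = sum(w * c for w, c in zip(weights, criteria))
        ranked.append((name, score))
    # Highest total score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical criteria for resolving the pronoun "she":
# [semantic similarity, recency, gender agreement], each in [0, 1].
candidates = {
    "Marie Curie": [0.9, 0.5, 1.0],
    "the laboratory": [0.2, 0.75, 0.0],
    "Pierre": [0.7, 0.25, 0.0],
}
weights = [0.5, 0.25, 0.25]  # assumed fixed weights, not learned

ranking = rank_antecedents(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy setup the top-ranked candidate is taken as the antecedent. A model like the one in the abstract would instead determine the weights together with the scores rather than fixing them in advance.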

Updated: 2020-11-06