

A deep neural network model for speakers coreference resolution in legal texts
Information Processing & Management (IF 7.4), Pub Date: 2020-08-20, DOI: 10.1016/j.ipm.2020.102365
Donghong Ji, Jun Gao, Hao Fei, Chong Teng, Yafeng Ren

Coreference resolution is a fundamental task in natural language processing (NLP) and is important for understanding the semantics of text. It is also essential for many downstream NLP applications. Existing methods largely focus on resolving pronouns, possessives and noun phrases in the general domain, while little work addresses professional domains such as the legal field. Unlike general texts, legal texts pose the challenge of how to encode the text, capture the relationships between entities in it, and then resolve coreference. To better understand legal text and to facilitate a range of downstream tasks in legal text mining, we propose a deep neural network model for coreference resolution in court record documents. Specifically, a pre-trained language model and bi-directional long short-term memory networks are first used to encode legal texts. Second, graph neural networks are applied to incorporate reference relations between entities. Finally, two distinct classifiers are used to score the candidate pairs. Results on the dataset show that our model achieves an F1 score of 87.53% on court record documents, outperforming neural baseline models by a large margin. Further analysis shows that the proposed method can effectively identify the reference relations between entities and model the entity dependencies.
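The three stages named in the abstract (encode, propagate over a graph of reference relations, score candidate pairs) can be illustrated with a toy sketch in plain Python. This is not the authors' model: the mention strings, the two-dimensional vectors, the single edge, the mean-aggregation step and the dot-product scorer are all hypothetical stand-ins chosen for illustration, and the sketch uses one scoring function where the abstract describes two distinct classifiers.

```python
import math


def mean_aggregate(features, edges):
    """One round of graph message passing: each node averages its own
    vector with those of its neighbours (edges are treated as undirected)."""
    n = len(features)
    neighbours = {i: {i} for i in range(n)}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    dim = len(features[0])
    updated = []
    for i in range(n):
        group = [features[j] for j in sorted(neighbours[i])]
        updated.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return updated


def pair_score(u, v):
    """Sigmoid of the dot product: a stand-in for a learned pair classifier."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


# Hypothetical mentions; the vectors stand in for encoder output
# (a pre-trained language model plus BiLSTM in the paper).
mentions = ["the plaintiff", "she", "the defendant"]
features = [[1.0, 0.0], [0.8, 0.2], [-1.0, 0.5]]

# Hypothetical reference relation linking mentions 0 and 1.
edges = [(0, 1)]

hidden = mean_aggregate(features, edges)

# Score every candidate pair (i, j) with i < j.
scores = {
    (i, j): pair_score(hidden[i], hidden[j])
    for i in range(len(mentions))
    for j in range(i + 1, len(mentions))
}
best = max(scores, key=scores.get)
print(mentions[best[0]], "<->", mentions[best[1]], round(scores[best], 3))
```

In this toy setup the graph step pulls the two linked mentions toward a shared representation, so their pair receives the highest score. A real system would learn both the aggregation weights and the classifiers from annotated court records.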


Updated: 2020-08-20
