Legal case document similarity: You need both network and text

Information Processing & Management (IF 8.6), Pub Date: 2022-09-05, DOI: 10.1016/j.ipm.2022.103069
Paheli Bhattacharya, Kripabandhu Ghosh, Arindam Pal, Saptarshi Ghosh

Estimating the similarity between two legal case documents is an important and challenging problem, with downstream applications such as prior-case retrieval and citation recommendation. There are two broad approaches to the task: citation network-based and text-based. Prior citation network-based approaches consider only citations to prior cases (also called precedents), an approach referred to as PCNet. This misses important signals inherent in Statutes (the written laws of a jurisdiction). In this work, we propose Hier-SPCNet, which augments PCNet with a heterogeneous network of Statutes. We incorporate domain knowledge for legal document similarity into Hier-SPCNet, thereby obtaining state-of-the-art results for network-based legal document similarity.
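To make the network idea concrete, the following is a minimal sketch and not the authors' Hier-SPCNet method, which the abstract does not specify in detail. It assumes each case is represented by the set of nodes it cites and scores two cases by Jaccard overlap; the case names, statute labels, and the choice of Jaccard overlap are all hypothetical, chosen only to show how adding Statute nodes can change a purely precedent-based score.

```python
# Illustrative only: a toy citation graph, NOT the Hier-SPCNet construction.
# All case and statute identifiers below are hypothetical.

precedent_citations = {
    "case_A": {"case_X", "case_Y"},
    "case_B": {"case_Y", "case_Z"},
}
statute_citations = {
    "case_A": {"statute_1", "statute_2"},
    "case_B": {"statute_1", "statute_2"},
}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def neighbours(case, use_statutes):
    """Nodes cited by a case: precedents, plus statutes if requested."""
    nodes = set(precedent_citations.get(case, set()))
    if use_statutes:
        nodes |= statute_citations.get(case, set())
    return nodes


def network_similarity(c1, c2, use_statutes=True):
    """Assumed similarity: overlap of the two cases' cited nodes."""
    return jaccard(neighbours(c1, use_statutes), neighbours(c2, use_statutes))


# Precedent-only view (PCNet-like) versus precedents plus statutes.
pc_only = network_similarity("case_A", "case_B", use_statutes=False)
with_statutes = network_similarity("case_A", "case_B", use_statutes=True)
print(pc_only, with_statutes)
```

In this toy data the two cases share one of three precedents but both of their statutes, so the score rises once statutes are included. That is the kind of signal the abstract says a precedent-only network misses.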

Both textual and network similarity provide important signals for legal case similarity, yet so far only trivial attempts have been made to unify the two. In this work, we apply several methods for combining textual and network information to estimate legal case similarity. We perform extensive experiments over legal case documents from the Indian judiciary, where the gold standard similarity between document pairs is judged by law experts from two reputed law institutes in India. Our experiments establish that our proposed network-based methods significantly improve the correlation with domain experts’ opinion compared to existing methods for network-based legal document similarity. Our best-performing combination method (which combines network-based and text-based similarity) improves the correlation with domain experts’ opinion by 11.8% over the best text-based method and by 20.6% over the best network-based method. We also establish that our best-performing method can be used to recommend or retrieve citable and similar cases for a source (query) case, which are well appreciated by legal experts.
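The evaluation described above can be sketched as follows, under stated assumptions. The abstract does not say which combination methods or which correlation measure were used, so this example assumes the simplest possible combination (a weighted average with a hypothetical weight `alpha`) and Pearson correlation. All scores are made up for illustration and are not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def combine(text_scores, network_scores, alpha=0.5):
    """Assumed combination: weighted average of the two similarity scores."""
    return [alpha * t + (1 - alpha) * n
            for t, n in zip(text_scores, network_scores)]


# Hypothetical scores for four document pairs (not data from the paper).
expert = [0.1, 0.4, 0.5, 0.9]    # gold-standard expert judgements
text = [0.2, 0.3, 0.7, 0.8]      # some text-based similarity
network = [0.0, 0.5, 0.4, 1.0]   # some network-based similarity

combined = combine(text, network, alpha=0.5)

for name, scores in [("text", text), ("network", network), ("combined", combined)]:
    print(name, round(pearson(scores, expert), 3))
```

Each method is scored by how well its similarity values track the expert judgements across document pairs; comparing those correlations is how a combined method would be measured against text-only and network-only baselines.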

Updated: 2022-09-05
