VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law
arXiv - CS - Digital Libraries, Pub Date: 2021-08-23, DOI: arxiv-2108.10120
Julien Rossi, Svitlana Vakulenko, Evangelos Kanoulas

Citing legal opinions is a key part of legal argumentation, an expert task that requires retrieval, extraction and summarization of information from court decisions. The identification of legally salient parts in an opinion for the purpose of citation may be seen as a domain-specific formulation of a highlight extraction or passage retrieval task. While similar tasks in other domains such as web search have seen significant attention and improvement, progress in the legal domain is hindered by the lack of resources for training and evaluation. This paper presents a new dataset that consists of the citation graph of court opinions, which cite previously published court opinions in support of their arguments. In particular, we focus on verbatim quotes, i.e., cases where the text of the original opinion is directly reused. With this approach, we explain the relative importance of different text spans of a court opinion by showcasing their usage in citations and measuring their contribution to the relations between opinions in the citation graph. We release VerbCL, a large-scale dataset derived from CourtListener, and introduce the task of highlight extraction as a single-document summarization task based on the citation graph, establishing the first baseline results for this task on the VerbCL dataset.
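
The idea of using verbatim quotes as a signal for salient spans can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`split_sentences`, `extract_quotes`, `highlight_counts`), the double-quote heuristic, the naive sentence splitter and the `min_words` threshold are all assumptions made for illustration. The sketch counts, for each sentence of a cited opinion, how many citing opinions reuse part of it verbatim inside quotation marks.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption); legal text with 'v.' or 'U.S.' needs a real tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extract_quotes(text, min_words=5):
    """Return spans enclosed in straight double quotes with at least min_words words."""
    spans = re.findall(r'"([^"]+)"', text)
    return [s.strip() for s in spans if len(s.split()) >= min_words]


def normalize(text):
    """Lowercase and collapse whitespace so matching tolerates layout differences."""
    return " ".join(text.lower().split())


def highlight_counts(cited_text, citing_texts, min_words=5):
    """For each sentence of the cited opinion, count the citing opinions quoting it verbatim."""
    sentences = split_sentences(cited_text)
    counts = Counter()
    for citing in citing_texts:
        quotes = [normalize(q) for q in extract_quotes(citing, min_words)]
        for idx, sentence in enumerate(sentences):
            norm = normalize(sentence)
            # A sentence is credited at most once per citing opinion.
            if any(q in norm for q in quotes):
                counts[idx] += 1
    return [(sentences[i], counts[i]) for i in range(len(sentences))]
```

Sentences with the highest counts would be candidate highlights under this toy scoring. A quote that spans several sentences of the cited opinion would be missed here, which is one reason a realistic matcher would work on the full text rather than sentence by sentence.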

Updated: 2021-08-24