TempCourt: evaluation of temporal taggers on a new corpus of court decisions
The Knowledge Engineering Review (IF 2.8), Pub Date: 2019-12-17, DOI: 10.1017/s0269888919000195
María Navas-Loro, Erwin Filtz, Víctor Rodríguez-Doncel, Axel Polleres, Sabrina Kirrane

The extraction and processing of temporal expressions (TEs) in textual documents have been extensively studied in several domains; however, for the legal domain it remains an open challenge. This is possibly due to the scarcity of corpora in the domain and the particularities found in legal documents that are highlighted in this paper. Considering the pivotal role played by temporal information when it comes to analyzing legal cases, this paper presents TempCourt, a corpus of 30 legal documents from the European Court of Human Rights, the European Court of Justice, and the United States Supreme Court with manually annotated TEs. The corpus contains two different temporal annotation sets that adhere to the TimeML standard, the first one capturing all TEs and the second dedicated to TEs that are relevant for the case under judgment (thus excluding dates of previous court decisions). The proposed gold standards are subsequently used to compare ten state-of-the-art cross-domain temporal taggers, and to identify not only the limitations of cross-domain temporal taggers but also limitations of the TimeML standard when applied to legal documents. Finally, the paper identifies the need for dedicated resources and the adaptation of existing tools, and specific annotation guidelines that can be adapted to different types of legal documents.
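
As a rough illustration of what comparing a tagger against a TimeML gold standard can involve, the sketch below extracts TIMEX3 extents from inline-annotated text and computes strict-match precision, recall, and F1. It is only a minimal sketch under stated assumptions: the abstract does not describe the paper's actual evaluation protocol, the two sample sentences are invented for illustration, and the helper names `timex_spans` and `strict_prf` are hypothetical.

```python
import re

# Inline TIMEX3 elements as used in TimeML-style annotation.
TIMEX3 = re.compile(r"<TIMEX3\b[^>]*>(.*?)</TIMEX3>", re.DOTALL)
# Any markup tag, used to recover plain-text offsets.
ANY_TAG = re.compile(r"<[^>]+>")


def timex_spans(annotated):
    """Return the set of (start, end) offsets of TIMEX3 extents.

    Offsets are measured in the plain text, i.e. with all tags removed,
    so that two annotations of the same document are comparable.
    """
    spans = set()
    plain_len = 0  # length of plain text consumed so far
    pos = 0        # position in the annotated string
    for match in TIMEX3.finditer(annotated):
        plain_len += len(ANY_TAG.sub("", annotated[pos:match.start()]))
        extent = ANY_TAG.sub("", match.group(1))
        spans.add((plain_len, plain_len + len(extent)))
        plain_len += len(extent)
        pos = match.end()
    return spans


def strict_prf(gold, system):
    """Strict-match precision, recall, and F1: extents must be identical."""
    true_positives = len(gold & system)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example sentences (not taken from the TempCourt corpus).
gold_doc = (
    'The application was lodged on <TIMEX3 tid="t1" type="DATE" '
    'value="2004-03-12">12 March 2004</TIMEX3>, following the judgment of '
    '<TIMEX3 tid="t2" type="DATE" value="1998-05-05">5 May 1998</TIMEX3>.'
)
system_doc = (
    'The application was lodged on <TIMEX3 tid="t1" type="DATE" '
    'value="2004-03-12">12 March 2004</TIMEX3>, following the judgment of 5 '
    '<TIMEX3 tid="t2" type="DATE" value="1998-05">May 1998</TIMEX3>.'
)

gold_spans = timex_spans(gold_doc)
system_spans = timex_spans(system_doc)
scores = strict_prf(gold_spans, system_spans)
print(scores)
```

Here the hypothetical system output truncates the second expression, so under strict matching it counts as both a false positive and a false negative. A relaxed (overlap-based) match would score it as correct, which is one reason results from different matching criteria are usually reported separately.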

Updated: 2019-12-17