On Minimizing Cost in Legal Document Review Workflows
arXiv - CS - Human-Computer Interaction. Pub Date: 2021-06-18. DOI: arxiv-2106.09866
Eugene Yang, David D. Lewis, Ophir Frieder

Technology-assisted review (TAR) refers to human-in-the-loop machine learning workflows for document review in legal discovery and other high recall review tasks. Attorneys and legal technologists have debated whether review should be a single iterative process (one-phase TAR workflows) or whether model training and review should be separate (two-phase TAR workflows), with implications for the choice of active learning algorithm. The relative cost of manual labeling for different purposes (training vs. review) and of different documents (positive vs. negative examples) is a key and neglected factor in this debate. Using a novel cost dynamics analysis, we show analytically and empirically that these relative costs strongly impact whether a one-phase or two-phase workflow minimizes cost. We also show how category prevalence, classification task difficulty, and collection size impact the optimal choice not only of workflow type, but of active learning method and stopping point.
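
To make the cost argument concrete, here is a minimal Python sketch of how such relative costs could be tallied. Everything in it is an illustrative assumption rather than the authors' formulation: the `UnitCosts` fields, the `workflow_cost` helper, and all document counts are hypothetical and are not results from the paper. It only shows that the same pair of workflows can swap rank when the unit cost of training labels changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Hypothetical per-document labeling costs, in arbitrary units."""
    train_pos: float   # labeling a positive document while training the model
    train_neg: float   # labeling a negative document while training the model
    review_pos: float  # reviewing a positive document after training stops
    review_neg: float  # reviewing a negative document after training stops


def workflow_cost(costs, train_pos, train_neg, review_pos, review_neg):
    """Total cost as a weighted sum of documents labeled in each role."""
    return (costs.train_pos * train_pos
            + costs.train_neg * train_neg
            + costs.review_pos * review_pos
            + costs.review_neg * review_neg)


# Invented document counts (train_pos, train_neg, review_pos, review_neg).
# One-phase: every labeled document also feeds the model; no separate review.
one_phase = (1000, 500, 0, 0)
# Two-phase: a small training set, then a larger bulk review pass.
two_phase = (200, 100, 800, 1200)

uniform = UnitCosts(1, 1, 1, 1)             # all labels cost the same
costly_training = UnitCosts(10, 10, 1, 1)   # training labels cost 10x more

for name, costs in [("uniform", uniform), ("costly training", costly_training)]:
    c1 = workflow_cost(costs, *one_phase)
    c2 = workflow_cost(costs, *two_phase)
    cheaper = "one-phase" if c1 < c2 else "two-phase"
    print(f"{name}: one-phase={c1}, two-phase={c2}, cheaper={cheaper}")
```

With these made-up counts, the one-phase workflow is cheaper under uniform costs (1500 vs. 2300), while the two-phase workflow is cheaper once training labels cost ten times as much as review labels (5000 vs. 15000).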

Updated: 2021-06-25