Goldilocks: Just-Right Tuning of BERT for Technology-Assisted Review
arXiv - CS - Information Retrieval. Pub Date: 2021-05-03, DOI: arxiv-2105.01044
Eugene Yang, Sean MacAvaney, David D. Lewis, Ophir Frieder

Technology-assisted review (TAR) refers to iterative active learning workflows for document review in high recall retrieval (HRR) tasks. TAR research and most commercial TAR software have applied linear models such as logistic regression or support vector machines to lexical features. Transformer-based models with supervised tuning have been found to improve effectiveness on many text classification tasks, suggesting their use in TAR. We indeed find that the pre-trained BERT model reduces review volume by 30% in TAR workflows simulated on the RCV1-v2 newswire collection. In contrast, we find that linear models outperform BERT for simulated legal discovery topics on the Jeb Bush e-mail collection. This suggests the match between transformer pre-training corpora and the task domain is more important than generally appreciated. Additionally, we show that just-right language model fine-tuning on the task collection before starting active learning is critical. Either too little or too much fine-tuning results in performance worse than that of linear models, even for RCV1-v2.
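As a point of reference, the iterative workflow the abstract describes can be sketched as a simple relevance-feedback loop: train a classifier on the documents reviewed so far, score the remainder, and send the top-scoring batch out for review. The sketch below is a minimal illustration only, using a scikit-learn logistic regression over TF-IDF lexical features (the linear baseline the abstract mentions); the function name, seed set, batch size, and stopping rule are hypothetical choices for illustration, not the paper's experimental protocol.

```python
# Minimal sketch of a TAR-style active learning loop (illustrative only).
# Assumes a linear baseline: logistic regression over TF-IDF lexical features.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def tar_loop(texts, oracle_labels, seed_idx, batch_size=100, rounds=10):
    """Iteratively review the highest-scoring unreviewed documents.

    Assumes the seed set contains both relevant (1) and non-relevant (0)
    examples so the classifier can be trained on the first round.
    """
    X = TfidfVectorizer(sublinear_tf=True).fit_transform(texts)
    reviewed = set(seed_idx)
    for _ in range(rounds):
        train = sorted(reviewed)
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X[train], [oracle_labels[i] for i in train])
        # Score all documents and review the top-scoring unreviewed batch
        # (relevance feedback); stop early if everything has been reviewed.
        scores = clf.predict_proba(X)[:, 1]
        unreviewed = [i for i in np.argsort(-scores) if i not in reviewed]
        if not unreviewed:
            break
        reviewed.update(unreviewed[:batch_size])
    # Recall achieved for the review effort spent so far.
    recall = sum(oracle_labels[i] for i in reviewed) / max(1, sum(oracle_labels))
    return reviewed, recall
```

In the paper's setting, the BERT variant would replace the logistic regression step with a transformer classifier, optionally preceded by language-model fine-tuning on the task collection; the loop structure itself is unchanged.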

Updated: 2021-05-04