Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
arXiv - CS - Information Retrieval, Pub Date: 2021-01-02, DOI: arxiv-2101.00436
Omar Khattab, Christopher Potts, Matei Zaharia

Multi-hop reasoning (i.e., reasoning across two or more documents) at scale is a key step toward NLP models that can exhibit broad world knowledge by leveraging large collections of documents. We propose Baleen, a system that improves the robustness and scalability of multi-hop reasoning over current approaches. Baleen introduces a per-hop condensed retrieval pipeline to mitigate the size of the search space, a focused late interaction retriever (FliBERT) that can model complex multi-hop queries, and a weak supervision strategy, latent hop ordering, to learn from limited signal about which documents to retrieve for a query. We evaluate Baleen on the new many-hop claim verification dataset HoVer, establishing state-of-the-art performance.
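
The per-hop loop described in the abstract (retrieve, condense what was retrieved, then use the condensed context to form the next hop's query) might be sketched roughly as follows. This is only a toy illustration under stated assumptions: the token-overlap scorer stands in for a learned retriever and is not FliBERT, the sentence picker stands in for a learned condenser, and every function name and parameter here is hypothetical rather than taken from the paper.

```python
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase bag-of-words tokens; a stand-in for a learned encoder."""
    return set(text.lower().replace(".", " ").replace(",", " ").split())


def overlap_score(query: str, text: str) -> int:
    """Toy relevance score: number of shared tokens (assumption, not FliBERT)."""
    return len(tokenize(query) & tokenize(text))


def retrieve(query: str, corpus: List[str], k: int, exclude: List[str]) -> List[str]:
    """Return the k best-scoring passages not retrieved in an earlier hop."""
    candidates = [p for p in corpus if p not in exclude]
    ranked = sorted(candidates, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def condense(query: str, passages: List[str], max_facts: int) -> List[str]:
    """Keep only the most relevant sentences from the retrieved passages."""
    sentences = [s.strip() for p in passages for s in p.split(".") if s.strip()]
    ranked = sorted(sentences, key=lambda s: overlap_score(query, s), reverse=True)
    return ranked[:max_facts]


def multi_hop_retrieve(claim: str, corpus: List[str], hops: int = 2,
                       k: int = 1, max_facts: int = 1) -> List[str]:
    """Run `hops` rounds of retrieve -> condense -> extend the query."""
    query = claim
    retrieved: List[str] = []
    facts: List[str] = []
    for _ in range(hops):
        passages = retrieve(query, corpus, k, exclude=retrieved)
        retrieved.extend(passages)
        facts.extend(condense(query, passages, max_facts))
        # The next hop's query is the claim plus the condensed facts only,
        # so the query grows by a few sentences rather than whole passages.
        query = claim + " " + " ".join(facts)
    return retrieved
```

Calling `multi_hop_retrieve` with a claim and a list of passage strings returns the passages picked across hops, in retrieval order. The point of the sketch is the control flow: each hop passes forward a short condensed context instead of every retrieved passage in full.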

Updated: 2021-01-05