Learning to Truncate Ranked Lists for Information Retrieval
arXiv - CS - Information Retrieval, Pub Date: 2021-02-25, DOI: arxiv-2102.12793
Chen Wu, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Xueqi Cheng

Ranked list truncation is of critical importance in a variety of professional information retrieval applications such as patent search or legal search. The goal is to dynamically determine the number of returned documents according to some user-defined objectives, in order to reach a balance between the overall utility of the results and user efforts. Existing methods formulate this task as a sequential decision problem and take some pre-defined loss as a proxy objective, which suffers from the limitation of local decision and non-direct optimization. In this work, we propose a global decision based truncation model named AttnCut, which directly optimizes user-defined objectives for the ranked list truncation. Specifically, we take the successful transformer architecture to capture the global dependency within the ranked list for truncation decision, and employ the reward augmented maximum likelihood (RAML) for direct optimization. We consider two types of user-defined objectives which are of practical usage. One is the widely adopted metric such as F1 which acts as a balanced objective, and the other is the best F1 under some minimal recall constraint which represents a typical objective in professional search. Empirical results over the Robust04 and MQ2007 datasets demonstrate the effectiveness of our approach as compared with the state-of-the-art baselines.
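
To make the two objectives in the abstract concrete, here is a minimal sketch in Python. It is not the AttnCut model or the authors' code. It only computes F1 at every cut position of a ranked list with known binary relevance labels, picks the best cut with or without a minimal recall constraint, and turns per-position rewards into the exponentiated-reward target distribution that RAML trains against. The function names, the temperature `tau`, and the assumption that every relevant document appears somewhere in the list are illustrative choices, not details taken from the paper.

```python
import math


def f1_at_cuts(labels):
    """Return (f1, recall) for every cut position k = 1..n.

    `labels` are binary relevance labels in ranked order. Assumption:
    all relevant documents for the query appear somewhere in the list.
    """
    total_relevant = sum(labels)
    scores = []
    hits = 0
    for k, rel in enumerate(labels, start=1):
        hits += rel
        precision = hits / k
        recall = hits / total_relevant if total_relevant else 0.0
        if hits:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores.append((f1, recall))
    return scores


def best_cut(labels, min_recall=0.0):
    """Smallest cut position with the highest F1 among cuts whose
    recall is at least `min_recall`. Falls back to the full list if
    no cut satisfies the constraint."""
    feasible = [
        (f1, k)
        for k, (f1, recall) in enumerate(f1_at_cuts(labels), start=1)
        if recall >= min_recall
    ]
    if not feasible:
        return len(labels)
    top = max(f1 for f1, _ in feasible)
    return min(k for f1, k in feasible if f1 == top)


def raml_targets(rewards, tau=1.0):
    """Exponentiated-reward distribution q(k) proportional to
    exp(reward_k / tau), used as a soft training target in RAML.
    `tau` is a hypothetical temperature setting."""
    peak = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - peak) / tau) for r in rewards]
    norm = sum(weights)
    return [w / norm for w in weights]


labels = [1, 1, 0, 0, 1]
rewards = [f1 for f1, _ in f1_at_cuts(labels)]
targets = raml_targets(rewards, tau=0.1)
print(best_cut(labels), best_cut(labels, min_recall=1.0))
```

On this toy list the balanced F1 objective stops after the second document, while requiring full recall pushes the cut to the end of the list, which is the trade-off the second objective is meant to capture for professional search. A lower `tau` concentrates the target distribution on the highest-reward cut.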

Updated: 2021-02-26