

Unbiased Learning to Rank
ACM Transactions on Information Systems (IF 5.4) Pub Date: 2021-02-20, DOI: 10.1145/3439861
Qingyao Ai₁, Tao Yang₁, Huazheng Wang₂, Jiaxin Mao₃


How to obtain an unbiased ranking model by learning to rank with biased user feedback is an important research question for information retrieval (IR). Existing work on unbiased learning to rank (ULTR) can be broadly categorized into two groups: studies on unbiased learning algorithms with logged data, namely offline unbiased learning, and studies on unbiased parameter estimation with real-time user interactions, namely online learning to rank. While their definitions of unbiasedness differ, these two types of ULTR algorithms share the same goal: to find the best models that rank documents based on their intrinsic relevance or utility. However, most studies on offline and online unbiased learning to rank are carried out in parallel, without detailed comparisons of their background theories and empirical performance. In this article, we formalize the task of unbiased learning to rank and show that existing algorithms for offline unbiased learning and online learning to rank are two sides of the same coin. We evaluate eight state-of-the-art ULTR algorithms and find that many of them can be used in both offline settings and online environments with little or no modification. Further, we analyze how different offline and online learning paradigms affect the theoretical foundation and empirical effectiveness of each algorithm on both synthetic and real search data. Our findings provide important insights and guidelines for choosing and deploying ULTR algorithms in practice.


Updated: 2021-02-20
