RLPS: A Reinforcement Learning–Based Framework for Personalized Search
ACM Transactions on Information Systems (IF 5.6), Pub Date: 2021-05-06, DOI: 10.1145/3446617
Jing Yao, Zhicheng Dou, Jun Xu, Ji-Rong Wen

Personalized search is a promising way to improve search quality by taking user interests into consideration. Recently, machine learning and deep learning techniques have been successfully applied to search result personalization. Most existing models simply regard the personal search history as a static set of user behaviors and learn fixed ranking strategies based on all the recorded data. Although improvements have been achieved, these models ignore the fact that the search process is essentially a sequence of interactions between the search engine and the user. The user’s interests may change dynamically during the search process; therefore, a personalized search model would be more helpful if it could track the whole interaction process and adjust its ranking strategy continuously. In this article, we adapt reinforcement learning to personalized search and propose a framework, referred to as RLPS. It uses a Markov Decision Process (MDP) to track the sequential interactions between the user and the search engine, and continuously updates the underlying personalized ranking model with the user’s real-time feedback to learn the user’s dynamic interests. Within this framework, we implement two models: the listwise RLPS-L and the hierarchical RLPS-H. RLPS-L interacts with users and trains the ranking model with document lists, whereas RLPS-H improves model training by designing a layered structure and introducing document pairs. In addition, we design a feedback-aware personalized ranking component to capture the user’s feedback, which shapes the user interest profile for the next query. Experiments on the public AOL search log and a commercial log show significant improvements over existing personalized search models.
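To make the MDP framing concrete, the following is a minimal Python sketch, under simplifying assumptions, of the interaction loop the abstract describes: the state combines the user’s interest profile with the current query, the action is a personalized ranking of candidate documents, the reward comes from the user’s click feedback, and the transition folds that feedback back into the profile. All identifiers here (InteractionState, rerank, update_profile, and so on) are hypothetical illustrations rather than the authors’ implementation, and the toy term-overlap scorer stands in for the learned neural ranking model that RLPS actually trains.

```python
# A minimal sketch of the MDP view of personalized search described above.
# All names are hypothetical; RLPS itself trains a neural ranking model with
# reinforcement learning, not this toy term-overlap scorer.
from dataclasses import dataclass, field

@dataclass
class InteractionState:
    """State: the user's interest profile accumulated so far, plus the query."""
    profile: dict = field(default_factory=dict)  # term -> interest weight
    query: str = ""

def score(state: InteractionState, doc: str) -> float:
    """Toy personalized score: overlap between profile terms and document terms."""
    return sum(state.profile.get(term, 0.0) for term in doc.split())

def rerank(state: InteractionState, candidates: list) -> list:
    """Action: return a personalized ranking of the candidate documents."""
    return sorted(candidates, key=lambda d: score(state, d), reverse=True)

def reward(ranking: list, clicked: str) -> float:
    """Reward from real-time feedback: reciprocal rank of the clicked document."""
    for rank, doc in enumerate(ranking, start=1):
        if doc == clicked:
            return 1.0 / rank
    return 0.0

def update_profile(state: InteractionState, clicked: str, lr: float = 0.1) -> None:
    """State transition: fold the click feedback into the interest profile."""
    for term in clicked.split():
        state.profile[term] = state.profile.get(term, 0.0) + lr

# One simulated session: each query is a step of the MDP, and the click on the
# previous result shifts the profile used to rank results for the next query.
state = InteractionState()
session = [
    ("jaguar", ["jaguar animal facts", "jaguar car dealer"]),
    ("speed", ["animal top speed list", "car speed records"]),
]
for query, candidates in session:
    state.query = query
    ranking = rerank(state, candidates)
    clicked = next((d for d in candidates if "animal" in d), ranking[0])
    print(query, "->", ranking, "| reward:", reward(ranking, clicked))
    update_profile(state, clicked)
```

In the paper’s terms, RLPS-L would update such a model from whole ranked lists, while RLPS-H additionally derives document pairs within a layered structure; neither training detail is modeled in this toy loop.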
