Reinforcement Learning for Strategic Recommendations
arXiv - CS - Information Retrieval. Pub Date: 2020-09-15, DOI: arxiv-2009.07346
Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Strategic recommendations (SR) refer to the problem in which an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize long-term objectives, both for the user and for the business. These systems are in their infancy in industry and need practical solutions to some fundamental research challenges. At Adobe Research, we have been implementing such systems for various use-cases, including point-of-interest recommendations, tutorial recommendations, next-step guidance in multimedia editing software, and ad recommendation for optimizing lifetime value. Building these systems raises many research challenges, such as modeling the sequential behavior of users, deciding when to intervene and offer recommendations without annoying the user, evaluating policies offline with high confidence, safe deployment, non-stationarity, building systems from passive data that do not contain past recommendations, resource-constrained optimization in multi-user systems, scaling to large and dynamic action spaces, and handling and incorporating human cognitive biases. In this paper, we cover various use-cases and the research challenges we solved to make these systems practical.

Updated: 2020-09-17