A Deep Reinforcement Learning Recommender System With Multiple Policies for Recommendations
IEEE Transactions on Industrial Informatics (IF 11.7) Pub Date: 2022-09-26, DOI: 10.1109/tii.2022.3209290
Mingsheng Fu 1, Liwei Huang 1, Ananya Rao 2, Athirai A. Irissappane 2, Jie Zhang 3, Hong Qu 1

Deep reinforcement learning (DRL) based recommender systems are well suited to user cold-start problems, as they can capture user preferences progressively. However, most existing DRL-based recommender systems are suboptimal because they use a single policy to fit the dynamics of all users. We reformulate recommendation as a multitask Markov decision process, where each task represents a set of similar users. Since similar users have closer dynamics, a task-specific policy is more effective than a single universal policy shared by all users. To make recommendations for cold-start users, we first use a default policy to collect initial interactions and identify the user's task, after which the corresponding task-specific policy is employed. We optimize the framework with Q-learning and account for task uncertainty through a mutual-information term over tasks. Experiments on three real-world datasets verify the effectiveness of the proposed framework.
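The cold-start procedure described above can be illustrated with a minimal sketch. All names here (per-task Q-tables, task prototypes, the warm-up threshold) are hypothetical illustrations of the idea, not the paper's actual architecture: a default policy acts until enough interactions have been gathered, then the action is taken under a mixture of task-specific policies weighted by a posterior over tasks, reflecting task uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)
N_TASKS, N_ITEMS, STATE_DIM = 3, 5, 4

# Hypothetical per-task Q-tables: one policy per task (a cluster of similar users).
task_q_tables = rng.normal(size=(N_TASKS, STATE_DIM, N_ITEMS))
# A single default policy, used before the new user's task is known.
default_q = task_q_tables.mean(axis=0)
# Hypothetical task prototypes used to infer which task a user belongs to.
task_prototypes = rng.normal(size=(N_TASKS, STATE_DIM))

def identify_task(state):
    """Posterior over tasks given the user's interaction state (softmax similarity)."""
    logits = task_prototypes @ state
    p = np.exp(logits - logits.max())
    return p / p.sum()

def recommend(state, steps_done, warmup_steps=2):
    """Act with the default policy during warm-up; afterwards, mix the
    task-specific policies under the inferred task posterior."""
    if steps_done < warmup_steps:
        q = default_q
    else:
        posterior = identify_task(state)            # uncertainty over tasks
        q = np.tensordot(posterior, task_q_tables, axes=1)  # expected Q-table
    return int(np.argmax(state @ q))                # greedy item choice
```

In practice the Q-tables would be deep Q-networks trained jointly, and the task posterior would come from the learned mutual-information objective rather than fixed prototypes; this sketch only shows the control flow of switching from the default to a task-specific policy.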

Updated: 2024-08-22