A Deep Reinforcement Learning Recommender System With Multiple Policies for Recommendations
IEEE Transactions on Industrial Informatics (IF 11.7) Pub Date: 2022-09-26, DOI: 10.1109/tii.2022.3209290
Mingsheng Fu 1, Liwei Huang 1, Ananya Rao 2, Athirai A. Irissappane 2, Jie Zhang 3, Hong Qu 1

Deep reinforcement learning (DRL) based recommender systems are well suited to user cold-start problems, as they can capture user preferences progressively. However, most existing DRL-based recommender systems are suboptimal because they use a single policy to fit the dynamics of all users. We reformulate recommendation as a multitask Markov decision process, where each task represents a set of similar users. Since similar users have closer dynamics, a task-specific policy is more effective than a single universal policy shared by all users. To make recommendations for cold-start users, we first use a default policy to collect initial interactions and identify the user's task, after which the corresponding task-specific policy is employed. We optimize the framework with Q-learning and account for task uncertainty through a mutual-information term over tasks. Experiments on three real-world datasets verify the effectiveness of the proposed framework.
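The cold-start procedure described above can be illustrated with a minimal sketch. All names here (per-task Q-tables, task prototypes, the warm-up threshold) are hypothetical illustrations of the idea, not the paper's actual architecture: a default policy acts until enough interactions have been gathered, then the action is taken under a mixture of task-specific policies weighted by a posterior over tasks, reflecting task uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)
N_TASKS, N_ITEMS, STATE_DIM = 3, 5, 4

# Hypothetical per-task Q-tables: one policy per task (a cluster of similar users).
task_q_tables = rng.normal(size=(N_TASKS, STATE_DIM, N_ITEMS))
# A single default policy, used before the new user's task is known.
default_q = task_q_tables.mean(axis=0)
# Hypothetical task prototypes used to infer which task a user belongs to.
task_prototypes = rng.normal(size=(N_TASKS, STATE_DIM))

def identify_task(state):
    """Posterior over tasks given the user's interaction state (softmax similarity)."""
    logits = task_prototypes @ state
    p = np.exp(logits - logits.max())
    return p / p.sum()

def recommend(state, steps_done, warmup_steps=2):
    """Act with the default policy during warm-up; afterwards, mix the
    task-specific policies under the inferred task posterior."""
    if steps_done < warmup_steps:
        q = default_q
    else:
        posterior = identify_task(state)            # uncertainty over tasks
        q = np.tensordot(posterior, task_q_tables, axes=1)  # expected Q-table
    return int(np.argmax(state @ q))                # greedy item choice
```

In practice the Q-tables would be deep Q-networks trained jointly, and the task posterior would come from the learned mutual-information objective rather than fixed prototypes; this sketch only shows the control flow of switching from the default to a task-specific policy.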

Updated: 2024-08-22