Personalized task difficulty adaptation based on reinforcement learning
User Modeling and User-Adapted Interaction (IF 3.6), Pub Date: 2021-04-22, DOI: 10.1007/s11257-021-09292-w
Yaqian Zhang, Wooi-Boon Goh

Traditionally, task difficulty levels are determined by domain experts using hand-crafted rules. However, with the adoption of Massive Open Online Courses (MOOCs), manually personalizing task difficulty has become harder, because system designers face a very large question bank and a user base of individuals with diverse backgrounds and ability levels. This research develops a data-driven method that adaptively adjusts difficulty levels to maintain a target user performance level over a series of tasks whose difficulty varies widely among individuals. Specifically, difficulty adaptation is formulated as a reinforcement learning problem. To keep the interactive system responsive, a novel bootstrapped policy gradient (BPG) framework was developed, which incorporates prior knowledge of difficulty ranking into the policy gradient to improve sample efficiency. To obtain high-quality prior information on difficulty ranking, a clustering-based approach was proposed that learns a personalized difficulty ranking to capture users' individual differences. To evaluate the difficulty adaptation method, we focused on a visual memory training problem with a large question bank and a diverse user base. The proposed algorithms were combined and applied to a real-world application, an online visual-spatial memory recall game, and were shown to outperform the traditional rule-based adaptation approach in adapting to slow players while achieving comparable performance in adapting to fast players.
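To make the reinforcement learning formulation concrete, the following is a minimal sketch, not the authors' BPG method. It uses a plain softmax policy gradient (a gradient bandit with a running-average baseline) over a handful of difficulty levels, rewarding the policy for keeping a simulated player's score close to a target. Everything in it is an assumption made for illustration: the function names, the simulated player with fixed per-level recall probabilities, the reward `-|score - target|`, and all parameter values. It does not include the difficulty-ranking prior or the clustering step described above.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_score(recall_prob, n_items, rng):
    """Hypothetical player model: fraction of n_items recalled correctly."""
    hits = sum(1 for _ in range(n_items) if rng.random() < recall_prob)
    return hits / n_items


def adapt_difficulty(recall_probs, target=0.7, n_items=20,
                     steps=3000, lr=0.3, seed=0):
    """Learn a softmax policy over difficulty levels with a vanilla
    policy gradient. recall_probs[k] is the assumed per-item recall
    probability of the simulated player at difficulty level k."""
    rng = random.Random(seed)
    n_levels = len(recall_probs)
    prefs = [0.0] * n_levels      # policy parameters, one per level
    baseline = 0.0                # running mean of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)
        level = rng.choices(range(n_levels), weights=probs)[0]
        score = simulate_score(recall_probs[level], n_items, rng)

        # Reward is highest when the score matches the target performance.
        reward = -abs(score - target)
        advantage = reward - baseline
        baseline += (reward - baseline) / t

        # Gradient of log softmax: indicator(k == level) - probs[k].
        for k in range(n_levels):
            grad = (1.0 if k == level else 0.0) - probs[k]
            prefs[k] += lr * advantage * grad

    return softmax(prefs)


# Assumed player: recall probability drops as difficulty increases.
policy = adapt_difficulty([0.99, 0.9, 0.7, 0.4, 0.1])
best_level = max(range(len(policy)), key=policy.__getitem__)
print("policy:", [round(p, 3) for p in policy])
print("preferred level:", best_level)
```

In this toy setting the policy should tend to concentrate on the level whose expected score is closest to the target. A vanilla policy gradient like this one needs many interactions to do so, which is the sample-efficiency concern that motivates injecting prior difficulty-ranking knowledge in the paper.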




Updated: 2021-04-22