Personalized task difficulty adaptation based on reinforcement learning

Zhang, Yaqian; Goh, Wooi-Boon

doi:10.1007/s11257-021-09292-w

Published: 22 April 2021

Volume 31, pages 753–784 (2021)
User Modeling and User-Adapted Interaction

Abstract

Traditionally, task difficulty levels are determined by domain experts using hand-crafted rules. However, with the adoption of Massive Open Online Courses (MOOCs), it has become harder to personalize task difficulty manually, because system designers face a very large question bank and a user base with diverse backgrounds and ability levels. This research develops a data-driven method that adaptively adjusts difficulty levels to maintain a target user performance level over a series of tasks whose difficulty varies widely among individuals. Specifically, difficulty adaptation is formulated as a reinforcement learning problem. To keep the interactive system responsive, a novel bootstrapped policy gradient (BPG) framework was developed, which incorporates prior knowledge of difficulty ranking into policy gradient to improve sample efficiency. To obtain high-quality prior information on difficulty ranking, a clustering-based approach was proposed that learns a personalized difficulty ranking to capture users’ individual differences. To evaluate the difficulty adaptation method, we focused on a visual memory training problem with a large question bank and a diverse user base. The proposed algorithms were combined and applied to a real-world application, an online visual-spatial memory recall game, and were shown to outperform the traditional rule-based adaptation approach in adapting to slow players while achieving comparable performance in adapting to fast players.
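
To make the reinforcement learning formulation in the abstract concrete, the following is a minimal sketch of a plain softmax policy gradient (REINFORCE with a moving-average baseline) that picks a difficulty level to keep a simulated user near a target performance. It is not the authors' BPG framework and uses no difficulty-ranking prior. The user model, the reward (negative distance to the target), and all names and parameter values (`simulated_performance`, `adapt_difficulty`, `target`, `lr`, and so on) are assumptions introduced only for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_performance(ability, difficulty, rng):
    """Hypothetical user model (an assumption, not from the article):
    the expected score drops as difficulty rises above ability."""
    expected = 1.0 / (1.0 + math.exp(difficulty - ability))
    noisy = expected + rng.gauss(0.0, 0.05)
    return min(1.0, max(0.0, noisy))


def adapt_difficulty(ability, target=0.7, n_levels=7,
                     n_rounds=3000, lr=0.2, seed=0):
    """Learn a softmax policy over difficulty levels 0..n_levels-1.

    The reward is the negative distance between the observed performance
    and the target, so the policy is pushed toward the level that keeps
    the simulated user closest to the target.
    """
    rng = random.Random(seed)
    prefs = [0.0] * n_levels   # policy parameters, one per level
    baseline = 0.0             # moving average of recent rewards
    for _ in range(n_rounds):
        probs = softmax(prefs)
        action = rng.choices(range(n_levels), weights=probs)[0]
        perf = simulated_performance(ability, action, rng)
        reward = -abs(perf - target)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log softmax w.r.t. each preference:
        # 1 - p(a) for the chosen action, -p(a) for the others.
        for a in range(n_levels):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += lr * advantage * (indicator - probs[a])
    return softmax(prefs)


if __name__ == "__main__":
    policy = adapt_difficulty(ability=3.0)
    for level, p in enumerate(policy):
        print(f"level {level}: probability {p:.3f}")
```

A learner of this kind starts from a uniform policy and needs many interactions before it settles, which is the sample-efficiency concern that the abstract says motivates adding prior knowledge of difficulty ranking.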

Notes

In the case of \(\overset{\scriptscriptstyle \frown}{\pi}_{\theta}(a_{i})=0\), the gradient update is set to zero by letting \(\overset{\scriptscriptstyle \frown}{\pi}_{\theta}(a_{i})\) be equal to a constant.
To avoid a negative score, the minimum game score is set to zero.
The task posted on the Mechanical Turk platform was open to participants from all countries with acceptance rates over 95%.
The target level used in this experiment was chosen based on a preliminary study that employed a random selection method. The median and mean of the memorization time lie in the range of the 4th time bubble, i.e., 4200–5200 ms.

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Yaqian Zhang
Nanyang Technological University, Singapore, Singapore
Wooi-Boon Goh

Corresponding author

Correspondence to Yaqian Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Visual Memory Tasks

About this article

Cite this article

Zhang, Y., Goh, WB. Personalized task difficulty adaptation based on reinforcement learning. User Model User-Adap Inter 31, 753–784 (2021). https://doi.org/10.1007/s11257-021-09292-w

Received: 29 January 2020
Accepted: 25 February 2021
Published: 22 April 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11257-021-09292-w
