Reinforcement learning across development: What insights can we draw from a decade of research?
Developmental Cognitive Neuroscience (IF 4.6), Pub Date: 2019-11-06, DOI: 10.1016/j.dcn.2019.100733
Kate Nussenbaum, Catherine A Hartley

Over the past decade, reinforcement learning models have increasingly been used to study developmental change in value-based learning. It is unclear, however, whether these computational modeling studies, which have employed a wide variety of tasks and model variants, have reached convergent conclusions. In this review, we examine whether the tuning of model parameters that govern different aspects of learning and decision-making varies consistently as a function of age, and what neurocognitive developmental changes may account for differences in these parameter estimates across development. We explore whether patterns of developmental change in these estimates are better described by differences in the extent to which individuals adapt their learning processes to the statistics of different environments, or by more static learning biases that emerge across varied contexts. We focus specifically on learning rate and inverse temperature parameter estimates, and find evidence that, from childhood to adulthood, individuals become better at optimally weighting recent outcomes during learning across diverse contexts and less exploratory in their value-based decision-making. We provide recommendations for how these two possibilities, and potential alternative accounts, can be tested more directly to build a cohesive body of research that yields greater insight into the development of core learning processes.
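
For readers unfamiliar with the two parameters the abstract highlights, the following is a minimal sketch of how a learning rate and an inverse temperature typically enter a reinforcement learning model. It assumes a standard delta-rule value update and a softmax choice rule on a simple bandit task; the abstract does not specify any particular model, and the function names (`update_value`, `softmax_choice_probs`, `simulate_bandit`) and task settings are hypothetical, chosen only for illustration.

```python
import math
import random


def update_value(value, reward, alpha):
    """Delta-rule update (assumed form): move the estimate toward the
    outcome by a fraction alpha, the learning rate. A larger alpha
    weights recent outcomes more heavily."""
    return value + alpha * (reward - value)


def softmax_choice_probs(values, beta):
    """Softmax choice rule (assumed form): beta is the inverse temperature.
    beta = 0 gives uniform random choice; a larger beta gives more
    value-driven, less exploratory choice."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_bandit(alpha, beta, reward_probs, n_trials, seed=0):
    """Simulate one agent on a hypothetical bandit task with binary rewards."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)
    choices = []
    for _ in range(n_trials):
        probs = softmax_choice_probs(values, beta)
        choice = rng.choices(range(len(values)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[choice] else 0.0
        values[choice] = update_value(values[choice], reward, alpha)
        choices.append(choice)
    return values, choices


if __name__ == "__main__":
    final_values, choices = simulate_bandit(
        alpha=0.3, beta=5.0, reward_probs=[0.8, 0.2], n_trials=200
    )
    print("final value estimates:", final_values)
    print("fraction choosing option 0:", choices.count(0) / len(choices))
```

In studies of this kind, `alpha` and `beta` are usually estimated by fitting a model like this one to each participant's choices, and the fitted estimates are then compared across age groups.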




Updated: 2019-11-06