Computational evidence for hierarchically structured reinforcement learning in humans [Colloquium Papers (free online)]
Proceedings of the National Academy of Sciences of the United States of America ( IF 11.1 ) Pub Date : 2020-11-24 , DOI: 10.1073/pnas.1912330117
Maria K. Eckstein 1 , Anne G. E. Collins 1

Humans have the fascinating ability to achieve goals in a complex and constantly changing world, still surpassing modern machine-learning algorithms in terms of flexibility and learning speed. It is generally accepted that a crucial factor for this ability is the use of abstract, hierarchical representations, which employ structure in the environment to guide learning and decision making. Nevertheless, how we create and use these hierarchical representations is poorly understood. This study presents evidence that human behavior can be characterized as hierarchical reinforcement learning (RL). We designed an experiment to test specific predictions of hierarchical RL using a series of subtasks in the realm of context-based learning, and observed several behavioral markers of hierarchical RL: asymmetric switch costs between changes in higher-level versus lower-level features, faster learning in higher-valued than in lower-valued contexts, and a preference for higher-valued over lower-valued contexts. We replicated these results across three independent samples. We simulated three models—a classic RL model, a hierarchical RL model, and a hierarchical Bayesian model—and compared their behavior to human results. While the flat RL model captured some aspects of participants’ sensitivity to outcome values, and the hierarchical Bayesian model captured some markers of transfer, only hierarchical RL accounted for all patterns observed in human behavior. This work shows that hierarchical RL, a biologically inspired and computationally simple algorithm, can capture human behavior in complex, hierarchical environments, and opens avenues for future research in this field.
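The hierarchical RL idea described above can be sketched as a two-level learner: a high-level table of learned values over contexts ("task-sets") and, within each context, a low-level table of Q-values over stimulus-action pairs, both updated by the same prediction-error rule. The following minimal sketch is illustrative only; the class and parameter names (`HierarchicalRL`, `alpha_high`, `alpha_low`) are hypothetical and do not reproduce the paper's actual model specification.

```python
import random

class HierarchicalRL:
    """Two-level RL sketch (illustrative, not the paper's implementation).

    High level: learned scalar values over contexts.
    Low level: per-context Q-values over (stimulus, action) pairs.
    """

    def __init__(self, alpha_high=0.3, alpha_low=0.3):
        self.alpha_high = alpha_high    # high-level (context) learning rate
        self.alpha_low = alpha_low      # low-level (action) learning rate
        self.context_value = {}         # context -> scalar value
        self.q = {}                     # context -> {(stimulus, action): value}

    def act(self, context, stimulus, actions):
        """Pick the greedy action within the current context (random tie-break)."""
        table = self.q.setdefault(context, {})
        vals = [table.get((stimulus, a), 0.0) for a in actions]
        best = max(vals)
        return random.choice([a for a, v in zip(actions, vals) if v == best])

    def update(self, context, stimulus, action, reward):
        # Low-level update: prediction error on the chosen action's Q-value.
        table = self.q.setdefault(context, {})
        q_old = table.get((stimulus, action), 0.0)
        table[(stimulus, action)] = q_old + self.alpha_low * (reward - q_old)
        # High-level update: the same delta rule applied to the context itself,
        # so frequently rewarded contexts acquire higher value - mirroring the
        # reported faster learning in, and preference for, higher-valued contexts.
        v_old = self.context_value.get(context, 0.0)
        self.context_value[context] = v_old + self.alpha_high * (reward - v_old)
```

Because context values are learned alongside action values, switching contexts (a higher-level change) and switching stimuli within a context (a lower-level change) touch different tables, which is one way the asymmetric switch costs mentioned in the abstract can arise.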




Updated: 2020-11-25