Reinforcement Learning for Production-Based Cognitive Models
Topics in Cognitive Science (IF 2.9), Pub Date: 2021-06-09, DOI: 10.1111/tops.12546
Adrian Brasoveanu, Jakub Dotlačil

Production-based cognitive models, such as Adaptive Control of Thought-Rational (ACT-R) or Soar agents, have been a popular tool in cognitive science to model sequential decision processes. While the models have been useful in articulating assumptions and predictions of various theories, they unfortunately require a significant amount of hand coding, both with respect to what building blocks cognitive processes should consist of and with respect to how these building blocks are selected and ordered in a sequential decision process. Hand coding of large, realistic models poses a challenge for modelers, and also makes it unclear whether the models can be learned and are thus cognitively plausible. The learnability issue is probably most starkly present in cognitive models of linguistic skills, since linguistic skills involve richly structured representations and highly complex rules. We investigate how reinforcement learning (RL) methods can be used to solve the production selection and production ordering problem in ACT-R. We focus on four algorithms from the Q-learning family, tabular Q-learning and three versions of deep Q-networks (DQNs), as well as the ACT-R utility learning algorithm, which provides a baseline for the Q-learning algorithms. We compare the performance of these five algorithms in a range of lexical decision (LD) tasks framed as sequential decision problems. We observe that, unlike the ACT-R baseline, the Q-learning agents learn even the more complex LD tasks fairly well. However, tabular Q-learning and DQNs show a trade-off between speed of learning, applicability to more complex tasks, and how noisy the learned rules are. This indicates that the ACT-R subsymbolic system for procedural memory could be improved by incorporating more insights from RL approaches, particularly the function-approximation-based ones, which learn and generalize effectively in complex, more realistic tasks.
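As a rough illustration of the two learning mechanisms the abstract contrasts, the Python sketch below places a one-step tabular Q-learning update next to the ACT-R utility learning rule, using hypothetical production names and illustrative parameter values for a lexical decision step; it is an assumption-laden sketch, not the authors' implementation.

```python
# Minimal sketch (assumed names): tabular Q-learning vs. ACT-R utility learning
# for choosing among production rules in a lexical decision (LD) step.

PRODUCTIONS = ["retrieve_word", "respond_word", "respond_nonword"]  # hypothetical

ALPHA = 0.1   # learning rate (illustrative value)
GAMMA = 0.9   # discount factor for Q-learning (illustrative value)

# Tabular Q-learning: one value per (state, production) pair, bootstrapped
# from the best production available in the next state.
Q = {}  # (state, production) -> estimated return

def q_update(state, production, reward, next_state):
    best_next = max(Q.get((next_state, p), 0.0) for p in PRODUCTIONS)
    old = Q.get((state, production), 0.0)
    Q[(state, production)] = old + ALPHA * (reward + GAMMA * best_next - old)

# ACT-R utility learning: one utility per production, updated toward the
# reward credited to that production (U_i <- U_i + alpha * (R_i - U_i)),
# with no bootstrapping from future states.
U = {p: 0.0 for p in PRODUCTIONS}

def utility_update(production, reward):
    U[production] += ALPHA * (reward - U[production])

# Example: one simulated step of a hypothetical LD trial.
q_update("stimulus_on", "retrieve_word", reward=0.0, next_state="retrieved")
utility_update("retrieve_word", reward=0.0)
```

The deep Q-network variants discussed in the paper replace the Q table above with a neural network that approximates the state-production values, which is what allows them to generalize to the larger, more complex tasks where a tabular representation becomes impractical.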

Updated: 2021-07-13