The computational roots of positivity and confirmation biases in reinforcement learning
Trends in Cognitive Sciences (IF 16.7), Pub Date: 2022-05-31, DOI: 10.1016/j.tics.2022.04.005
Stefano Palminteri, Maël Lebreton

Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior beliefs are overweighted. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updating. We present evidence against this account. Learning rates in reinforcement learning (RL) tasks, estimated across different contexts and species, generally present the same characteristic asymmetry, suggesting that belief- and value-updating processes share key computational principles and distortions. This bias generates over-optimistic expectations about the probability of making the right choices and, consequently, over-optimistic reward expectations. We discuss the normative and neurobiological roots of these RL biases and their place within the broader picture of behavioral decision-making theories.
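The asymmetry the abstract refers to can be made concrete with a minimal sketch (ours, not the authors'; the learning-rate values are hypothetical): a Rescorla-Wagner value update that applies a larger learning rate to positive prediction errors than to negative ones. Run long enough, the value estimate settles above the true expected reward, reproducing the over-optimistic expectation described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Asymmetric Rescorla-Wagner update: a separate learning rate for
# positive vs. negative prediction errors. alpha_plus > alpha_minus
# implements a positivity bias. (Values here are hypothetical.)
alpha_plus, alpha_minus = 0.30, 0.10
p_reward = 0.5   # true reward probability of the option
Q = 0.0          # initial value estimate

for _ in range(10_000):
    r = float(rng.random() < p_reward)   # binary outcome, 0 or 1
    delta = r - Q                        # prediction error
    alpha = alpha_plus if delta > 0 else alpha_minus
    Q += alpha * delta

# The estimate fluctuates around the fixed point
#   Q* = (a+ * p) / (a+ * p + a- * (1 - p)) = 0.75,
# above the true expected reward of 0.5: an over-optimistic estimate.
print(f"estimated value: {Q:.3f}  (true expected reward: {p_reward})")
```

The confirmation-bias variant discussed in the article is structurally similar: the higher learning rate is applied to choice-confirming evidence (positive prediction errors for the chosen option, negative ones for the forgone option) rather than to positive outcomes per se.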


