The reward positivity reflects the integrated value of temporally threefold‐layered decision outcomes
Psychophysiology (IF 2.9) Pub Date: 2021-02-28, DOI: 10.1111/psyp.13789
Lena Rommerskirchen, Leon Lange, Roman Osinsky

In reinforcement learning, adaptive behavior depends on the ability to predict future outcomes based on previous decisions. The reward positivity (RewP) is thought to encode reward prediction errors in the anterior midcingulate cortex (aMCC) whenever these predictions are violated. Although the RewP has been studied extensively in the context of simple binary (win vs. loss) reward processing, recent studies suggest that the RewP scales with complex feedback in a fine-graded fashion. The aim of this study was to replicate and extend previous findings that the RewP reflects the integrated sum of the instantaneous and delayed consequences of a single outcome, by adding a third temporal dimension to the feedback information. We used a complex reinforcement-learning task in which each option was associated with an immediate, an intermediate, and a delayed monetary outcome, and analyzed the RewP in the time domain as well as fronto-medial theta power in the time-frequency domain. To test whether the RewP's sensitivity to the three outcome dimensions reflects stable, trait-like individual differences in reward processing, a retest session took place 3 months later. The results confirm that the RewP reflects the integrated value of complex, temporally extended consequences in a stable manner, although it showed no relation to behavioral choice. Our findings indicate that the medial frontal cortex receives fine-graded information about complex action outcomes that, however, may not necessarily translate into cognitive or behavioral control processes.
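To make the core quantities concrete, the following is a minimal, purely illustrative Python sketch of a generic Rescorla-Wagner-style delta rule in which each trial's feedback value is the integrated sum of its immediate, intermediate, and delayed components. The function names, learning rate, and payoff values are hypothetical and are not taken from the paper, which reports EEG findings rather than a computational model.

    # Purely illustrative sketch (not the authors' model): a delta-rule
    # update in which the feedback value on each trial is the integrated
    # sum of three temporally layered outcomes.

    ALPHA = 0.1  # hypothetical learning rate


    def integrated_value(immediate, intermediate, delayed):
        # The RewP is reported to scale with this sum rather than with
        # any single temporal component alone.
        return immediate + intermediate + delayed


    def update_expectation(expected, outcome, alpha=ALPHA):
        # The reward prediction error (outcome - expected) is the
        # quantity the RewP is thought to encode.
        rpe = outcome - expected
        return expected + alpha * rpe, rpe


    # Example: an option paying +10 now, -5 at an intermediate delay,
    # and +20 later, against an expectation of 15, yields an RPE of +10.
    outcome = integrated_value(10.0, -5.0, 20.0)           # 25.0
    new_expected, rpe = update_expectation(15.0, outcome)  # 16.0, +10.0
    print(f"integrated value={outcome}, RPE={rpe:+.1f}, "
          f"updated expectation={new_expected}")

On this reading, a fine-graded RewP means the error signal tracks the full integrated value (here 25) rather than, say, only the immediate win of +10.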

Updated: 2021-04-15