Adaptation of utility functions to reward distribution in rhesus monkeys
bioRxiv - Animal Behavior and Cognition Pub Date : 2020-05-25 , DOI: 10.1101/2020.05.22.110213
Philipe M. Bujold , Simone Ferrari-Toniolo , Wolfram Schultz

This study investigated the influence of experienced reward distributions on the shape of utility functions inferred from economic choice. Utility is the hypothetical variable that appears to be maximized by choice. Despite the generally accepted notion that utility functions are sensitive to external references, exactly when and how such changes occur remains largely unknown. Here we benefitted from the capacity to perform thorough and extensive experimental tests on one of our evolutionarily closest, experimentally viable and intuitively understandable species, the rhesus macaque monkey. Data from thousands of binary choices demonstrated that the animals' preferences changed depending on the statistics of recently experienced rewards and adapted to future expected rewards. The elicited utility functions shifted and extended their shape following months-long changes in the mean and range of the reward distributions. However, the adaptations were usually incomplete, suggesting that past experiences remained present when anticipating future rewards. Through modelling, we found that reinforcement learning provided a strong basis for explaining these adaptations. Thus, rather than having the stable, fixed preferences assumed by normative economic models, rhesus macaques flexibly shaped their preferences to optimize decision-making according to the statistics of the environment.
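The reinforcement-learning account of these adaptations can be pictured with a simple delta-rule sketch: a reference point tracks the mean of recently experienced rewards, and utility is evaluated relative to that moving reference. This is a minimal illustration of the general idea, not the authors' fitted model; the learning rate, the power-law curvature, and the reward values below are assumptions chosen for the example.

```python
import random

# Illustrative sketch (not the paper's model): a Rescorla-Wagner-style
# delta rule lets a reference point adapt to the mean of the experienced
# reward distribution, and utility is measured against that reference.

ALPHA = 0.1  # learning rate (assumed); values well below 1 leave
             # adaptation incomplete, as the study reports
RHO = 0.8    # power-law curvature (assumed); <1 gives a concave function

def update_reference(reference: float, reward: float, alpha: float = ALPHA) -> float:
    """Shift the reference point toward each newly experienced reward."""
    return reference + alpha * (reward - reference)

def utility(reward: float, reference: float, rho: float = RHO) -> float:
    """Power-law utility of a reward relative to the current reference."""
    gain = reward - reference
    return (abs(gain) ** rho) * (1 if gain >= 0 else -1)

# Simulate a block with a higher-mean reward distribution: as the
# reference climbs toward the new block mean, the same absolute reward
# gradually loses subjective value.
ref = 0.2
for _ in range(200):
    ref = update_reference(ref, random.uniform(0.4, 0.8))

print(f"adapted reference ~ {ref:.2f}")  # approaches the block mean of 0.6
print(f"utility of a 0.6 reward after adaptation: {utility(0.6, ref):+.3f}")
```

With ALPHA well below 1, the reference converges slowly and retains a trace of earlier blocks, which is one way to capture the incomplete adaptation the abstract describes.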

Updated: 2020-05-25