Modeling behavioral experiments on uncertainty and cooperation with population-based reinforcement learning
Simulation Modelling Practice and Theory (IF 4.2), Pub Date: 2021-03-03, DOI: 10.1016/j.simpat.2021.102299
Elias Fernández Domingos, Jelena Grujić, Juan C. Burguillo, Francisco C. Santos, Tom Lenaerts

From climate action to public health measures, human collective endeavors are often shaped by different uncertainties. Here we introduce a novel population-based learning model in which a group of individuals facing a collective risk dilemma acquire their strategies over time through reinforcement learning while handling different sources of uncertainty. In such an N-person collective risk dilemma, players make step-wise contributions to avoid a catastrophe that would result in a loss of wealth for all players. Success is attained if they collectively reach a certain contribution level over time or, when the threshold is not reached, if they are lucky enough to avoid the catastrophe. The dilemma lies in the trade-off between the share of personal wealth that players contribute toward the collective goal and the remainder they can keep at the end of the game. We show that the strategies learned with the model correspond to those experimentally observed, even when there is uncertainty about the risk of failing when the goal is not reached, the magnitude of the threshold to attain, or the time available to reach the target. We furthermore confirm that being unsure about the time window favors more extreme reactions and polarization, diminishing the number of agents that contribute fairly. The population-based online learning framework we propose is general enough to be applicable to a wide range of collective action problems and to arbitrarily large sets of available policies.
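
The payoff rule of the collective risk dilemma described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' model: the function name `play_crd`, the parameters `endowment`, `threshold`, and `risk`, and the assumption that a catastrophe wipes out all remaining wealth are hypothetical choices made here for clarity. The reinforcement learning part is not shown.

```python
import random


def play_crd(contributions, endowment, threshold, risk, rng=random):
    """Return each player's final wealth after one collective risk dilemma game.

    contributions: one list per player holding that player's step-wise
                   contributions over the rounds of the game.
    endowment:     starting wealth of every player (assumed equal for all).
    threshold:     total group contribution needed to avoid the catastrophe.
    risk:          probability of the catastrophe when the threshold is missed.
    """
    totals = [sum(steps) for steps in contributions]
    if any(total > endowment for total in totals):
        raise ValueError("a player contributed more than the endowment")

    # Wealth each player keeps if no catastrophe occurs.
    kept = [endowment - total for total in totals]

    # Success case 1: the group reached the collective target.
    if sum(totals) >= threshold:
        return kept

    # Threshold missed: the catastrophe strikes with probability `risk`.
    # Assumption: every player then loses all remaining wealth.
    if rng.random() < risk:
        return [0] * len(kept)

    # Success case 2: the group was lucky and keeps its remaining wealth.
    return kept
```

For example, with three players who contribute 4, 0, and 8 in total from an endowment of 10, a threshold of 12 is met and the players keep 6, 10, and 2. With a threshold of 13 they keep the same amounts only if the catastrophe, drawn with probability `risk`, does not occur.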




Updated: 2021-03-09