Randomized matrix games in a finite population: Effect of stochastic fluctuations in the payoffs on the evolution of cooperation

doi:10.1016/j.tpb.2020.04.006

Theoretical Population Biology

Volume 134, August 2020, Pages 77-91

https://doi.org/10.1016/j.tpb.2020.04.006

Abstract

A diffusion approximation for a randomized 2 × 2-matrix game in a large finite population is ascertained in the case of random payoffs whose expected values, variances and covariances are of order given by the inverse of the population size $N$. Applying the approximation to a Randomized Prisoner’s Dilemma (RPD) with independent payoffs for cooperation and defection in random pairwise interactions, conditions on the variances of the payoffs for selection to favor the evolution of cooperation, favor more the evolution of cooperation than the evolution of defection, and disfavor the evolution of defection are deduced. All these are obtained from probabilities of ultimate fixation of a single mutant. It is shown that the conditions are lessened with an increase in the variances of the payoffs for defection against cooperation and defection and a decrease in the variances of the payoffs for cooperation against cooperation and defection. An RPD game with independent payoffs whose expected values are additive is studied in detail to support the conclusions. Randomized matrix games with non-independent payoffs, namely the RPD game with additive payoffs for cooperation and defection based on random cost and benefit for cooperation and the repeated RPD game with Tit-for-Tat and Always-Defect as strategies in pairwise interactions with a random number of rounds, are studied under the assumption that the population-scaled expected values, variances and covariances of the payoffs are all of the same small enough order. In the first model, the conditions in favor of the evolution of cooperation hold only if the covariance between the cost and the benefit is large enough, while the analysis of the second model extends the results on the effects of the variances of the payoffs for cooperation and defection found for the one-round RPD game.

Introduction

Cooperative behavior is a phenomenon that is widely observed in nature. However, natural selection tends to enhance selfish behavior through fierce competition. In order to explain the rationality of cooperation and its evolution in natural populations, a two-player game known as the Prisoner’s Dilemma (PD) has been widely studied as one of the most important theoretical frameworks (Axelrod and Hamilton, 1981, Maynard Smith, 1982, Axelrod, 1984, Poundstone, 1992, Nowak and Highfield, 2011). In an additive version of the PD game, cooperation takes the form of a donor who pays a cost $c$ for a recipient to get a benefit $b$. Defection costs nothing and does not disqualify a player from receiving a benefit. Therefore, the payoff for cooperation never exceeds the payoff for defection (Nowak, 2006, Nowak and Sigmund, 2007). This is the case in more general versions of the PD game. Moreover, assuming random pairwise interactions in an infinite population and average payoffs as relative growth rates, the replicator equation (Taylor and Jonker, 1978) predicts global convergence to fixation of defection (Hofbauer and Sigmund, 1998).
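As a rough numerical illustration of this prediction, the sketch below integrates the replicator equation for the additive PD game with a forward Euler scheme. It assumes average payoffs $b x - c$ for cooperation and $b x$ for defection when the frequency of cooperators is $x$, so that $\dot{x} = - c x (1 - x)$. The function name `replicator_trajectory`, the step size and the parameter values are illustrative choices, not taken from the paper.

```python
def replicator_trajectory(x0, b, c, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = x (1 - x) (f_C - f_D).

    Assumed average payoffs in an infinite well-mixed population:
    f_C = b*x - c for cooperation and f_D = b*x for defection.
    """
    xs = [x0]
    x = x0
    for _ in range(steps):
        f_c = b * x - c  # average payoff to cooperation
        f_d = b * x      # average payoff to defection
        x = x + dt * x * (1.0 - x) * (f_c - f_d)
        xs.append(x)
    return xs


# Hypothetical parameter values: even starting at 90% cooperators,
# the frequency of cooperation declines toward 0.
trajectory = replicator_trajectory(x0=0.9, b=2.0, c=0.5)
print(trajectory[0], trajectory[-1])
```

Under these assumptions the benefit $b$ cancels out of the payoff difference, so the decline of cooperation depends only on the cost $c$.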

In a finite population of constant size $N$ undergoing discrete, non-overlapping generations according to a Wright–Fisher model and more general models with exchangeable reproduction schemes (Fisher, 1930, Wright, 1931, Cannings, 1974, Ewens, 2004, Lessard, 2011), the fixation probability for a neutral mutant type represented only once initially is just the inverse of the population size, that is, $N^{- 1}$. If this probability becomes larger than $N^{- 1}$ in the presence of selection, then the mutant type is said to be favored by selection (Nowak et al., 2004). Several mechanisms have been considered to explain how cooperation could be favored by natural selection assuming additive effects of average payoffs on fitness (Nowak and Sigmund, 2007). This is the case, for instance, for cooperation taking the form of the “Tit-for-Tat” strategy (Trivers, 1971, Axelrod and Hamilton, 1981, Axelrod, 1984) starting with cooperation in a repeated PD game between randomly chosen partners if the number of rounds exceeds some threshold value (Nowak et al., 2004). This is also the case in group-structured or graph-structured populations for modeling some social or geographical networks with local interactions (Ohtsuki et al., 2006). However, with a one-round PD game and constant payoffs in a well-mixed population, the fitness of cooperation never exceeds the fitness of defection, and, as a result, cooperation cannot be favored by selection.
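The neutral benchmark $N^{- 1}$ and the effect of constant PD payoffs can be checked on a small example. The following is a minimal sketch, assuming Wright–Fisher binomial sampling, fitness equal to 1 plus the average payoff, and constant additive payoffs with cost $c < 1$; the function name `fixation_probabilities` and all parameter values are hypothetical. It iterates the transition matrix of the Markov chain until absorption is numerically complete, rather than relying on a diffusion approximation.

```python
from math import comb


def fixation_probabilities(N, b=0.0, c=0.0, iterations=2000):
    """Probability that cooperation fixes from i copies, for i = 0..N.

    Assumptions (illustrative only): Wright-Fisher sampling of N offspring,
    fitness 1 + payoff, constant additive PD payoffs with 0 <= c < 1.
    """
    # P[i][j] = Pr(j cooperators next generation | i cooperators now).
    P = []
    for i in range(N + 1):
        x = i / N
        f_c = 1.0 + b * x - c  # fitness of a cooperator
        f_d = 1.0 + b * x      # fitness of a defector
        p = x * f_c / (x * f_c + (1.0 - x) * f_d)
        P.append([comb(N, j) * p**j * (1.0 - p) ** (N - j) for j in range(N + 1)])
    # After k iterations, u[i] = Pr(all cooperators at generation k | i at start),
    # which converges to the probability of ultimate fixation.
    u = [0.0] * N + [1.0]
    for _ in range(iterations):
        u = [sum(P[i][j] * u[j] for j in range(N + 1)) for i in range(N + 1)]
    return u


neutral = fixation_probabilities(10)
selected = fixation_probabilities(10, b=0.3, c=0.1)
print(neutral[1], selected[1])
```

With $b = c = 0$ the single-mutant entry should be close to $1 / N$, while with constant payoffs and $c > 0$ it falls below $1 / N$, in line with the statement that cooperation cannot be favored in this setting.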

In nature, there are changes not only in the composition of a population but also in the surrounding environment in which the population finds itself. These can affect the payoffs that individuals receive as a result of interactions with others. Randomness in evolutionary games can take several forms such as probabilistic encounter rules or mixed strategies depending or not on the replies of others (Taylor and Jonker, 1978, Eshel and Cavalli-Sforza, 1982, Hofbauer and Sigmund, 1998). Of particular interest are stochastic games which allow the environment to change in response to the players’ choices (Shapley, 1953, Fudenberg et al., 2012, Solan and Vieille, 2015, Hilbe et al., 2018). But also not to be forgotten are variations in payoffs caused by disturbances in the natural environment. These can be periodic, e.g., being seasonal or alternating day and night. But they can also be totally random as if occurring by accident (May, 1973, Kaplan et al., 1990, Lande et al., 2003). In the case of deterministically time-dependent payoffs in $2 \times 2$ matrix games, for instance, Broom (2005) compares the time average of the population state and the interior Nash equilibrium of the average payoff matrix and shows that they can be arbitrarily far apart. With periodic payoffs, even stable periodic orbits can be found from arbitrary starting points (Uyttendaele et al., 2012). On the other hand, it is shown in Stollmeier and Nagler (2018) that under the effects of random environmental noise, an evolutionary game involving two strategies with a strategy having a higher expected payoff at any frequency than the other can reach a stationary distribution with both strategies co-existing.

In a matrix game, unless stochastic fluctuations in the environment are small enough to be ignored, it is more accurate to use random payoffs than constant payoffs. In particular, the introduction of random payoffs extends the classical PD game to a randomized PD game. In order to reveal how environmental noise can generally affect the evolutionary game dynamics in an infinite population, the concepts of stochastic evolutionary stability (SES) and stochastic convergence stability (SCS) have been investigated (Zheng et al., 2017, Zheng et al., 2018). Applying these concepts to a one-round randomized PD game in a well-mixed population, it can be shown that the evolution of cooperation tends to be more easily favored by natural selection if the coefficients of variation of the payoffs are smaller for cooperation than for defection (Li et al., 2019).

On the other hand, in a population genetics framework for a large finite population, Karlin and Levikson (1974) have shown that, when the mean and variance of frequency-independent genotypic fitnesses are of the same order given by the inverse of the population size, the effect of the variance matters. Actually, variability in selection, meaning fluctuating selection intensities, produces a “drift effect” away from the fixation states.

In order to study the effect of stochastic fluctuations in the context of an evolutionary game in a large finite population, we consider in this paper a matrix game with random payoffs for two players using one of two strategies. After ascertaining a diffusion approximation for this model, we focus on the Randomized Prisoner’s Dilemma (RPD) with cooperation and defection as strategies, and we consider the probability of ultimate fixation of either strategy as a single mutant. Conditions that favor the evolution of cooperation are examined in detail in the case of independent payoffs such that the average effects of cooperation and defection are additive. An RPD game with random additive effects of cooperation and defection on the payoffs as well as a repeated RPD game are also studied.

Section snippets

The model

We consider a randomized matrix game with two strategies in a population of fixed finite size $N$. The two possible pure strategies used by the individuals in the population are denoted by $S_{1}$ and $S_{2}$. At time $t \geq 0$ corresponding to some generation, the frequencies of $S_{1}$ and $S_{2}$ are given by $x (t)$ and $1 - x (t)$, respectively, while their payoffs in pairwise interactions are given by the entries of the 2 × 2 random game matrix $\begin{pmatrix} η_{1} (t) & η_{2} (t) \\ η_{3} (t) & η_{4} (t) \end{pmatrix}.$ Here, $η_{1} (t)$ and $η_{2} (t)$ are the payoffs to strategy $S_{1}$

Diffusion approximation

Let $Δ x = x (t + Δ t) - x (t)$ be the change in the frequency of individuals that use strategy $S_{1}$ from time $t$ to time $t + Δ t$ . Given $x (t) = x$ , the first, second and fourth moments of $Δ x$ can be calculated as (see Appendix A for details) $E (Δ x | x (t) = x) = m (x) Δ t + o (Δ t),$ $E ({(Δ x)}^{2} | x (t) = x) = v (x) Δ t + o (Δ t)$ and $E ({(Δ x)}^{4} | x (t) = x) = o (Δ t),$ where $m (x) = x (1 - x) (μ_{2} - μ_{4} + x (μ_{1} - μ_{2} - μ_{3} + μ_{4}) + x^{3} (σ_{13} - σ_{1}^{2}) + x {(1 - x)}^{2} (2 σ_{34} - σ_{14} - σ_{23} + σ_{24} - σ_{2}^{2}) + x^{2} (1 - x) (- 2 σ_{12} + σ_{14} + σ_{23} - σ_{13} + σ_{3}^{2}) + {(1 - x)}^{3} (σ_{4}^{2} - σ_{24}))$ and $v (x) = x (1 - x) (1 + x^{3} (1 - x) (σ_{1}^{2} + σ_{3}^{2} - 2 σ_{13}) + x {(1 - x)}^{3} (σ_{2}^{2} + σ_{4}^{2} - 2 σ_{24}) + 2 x^{2} (1 - x$
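The drift coefficient $m (x)$ stated above can be evaluated numerically to see how each variance term acts. The sketch below is only an illustration: the function name `drift`, the way the parameters are stored (a list of population-scaled means and a symmetric 4 × 4 matrix whose diagonal holds the variances) and the numerical values are assumptions, not part of the paper. Only $m (x)$ is implemented, since the expression for $v (x)$ is cut off in this excerpt.

```python
def drift(x, mu, sigma):
    """Drift m(x) of the diffusion approximation, as written in the text.

    mu[i-1] is the population-scaled mean mu_i; sigma[i-1][j-1] is the
    covariance sigma_ij, with sigma[i-1][i-1] the variance sigma_i^2.
    """
    m1, m2, m3, m4 = mu

    def s(i, j):
        return sigma[i - 1][j - 1]

    y = 1.0 - x
    bracket = (
        m2 - m4 + x * (m1 - m2 - m3 + m4)
        + x**3 * (s(1, 3) - s(1, 1))
        + x * y**2 * (2 * s(3, 4) - s(1, 4) - s(2, 3) + s(2, 4) - s(2, 2))
        + x**2 * y * (-2 * s(1, 2) + s(1, 4) + s(2, 3) - s(1, 3) + s(3, 3))
        + y**3 * (s(4, 4) - s(2, 4))
    )
    return x * y * bracket


def diagonal(variances):
    """Covariance matrix for independent payoffs with the given variances."""
    return [[variances[i] if i == j else 0.0 for j in range(4)] for i in range(4)]


# Hypothetical additive means (b - c, -c, b, 0) with b = 2 and c = 1.
mu = [1.0, -1.0, 2.0, 0.0]
no_noise = drift(0.5, mu, diagonal([0.0, 0.0, 0.0, 0.0]))
noisy_defection = drift(0.5, mu, diagonal([0.0, 0.0, 0.0, 1.0]))
noisy_cooperation = drift(0.5, mu, diagonal([1.0, 0.0, 0.0, 0.0]))
print(no_noise, noisy_defection, noisy_cooperation)
```

In this example, variance in a payoff to $S_{2}$ raises the drift of $S_{1}$ and variance in a payoff to $S_{1}$ lowers it, which matches the signs of $σ_{4}^{2}$ and $σ_{1}^{2}$ in the expression for $m (x)$.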

Randomized Prisoner’s Dilemma (RPD)

Consider a random game matrix (1) with independent payoffs whose expected values determine a classical Prisoner’s Dilemma (PD). In this case, the population-scaled parameters in (3) satisfy $σ_{i j} = 0$ for $i, j = 1, \dots, 4$ with $i \neq j$, since the payoffs are uncorrelated, and $\begin{pmatrix} μ_{1} & μ_{2} \\ μ_{3} & μ_{4} \end{pmatrix} = \begin{pmatrix} R & S \\ T & P \end{pmatrix}$ with $T > R > P > S$ and $2 R > T + S$, which defines a PD game. Then, we have a randomized Prisoner’s Dilemma (RPD) with strategies $S_{1}$ and $S_{2}$ corresponding to cooperation ($C$) and defection ($D$), respectively.

Suppose that cooperation is

RPD with independent payoffs

In this section, we focus on an RPD game with independent payoffs whose expected values are such that $\begin{pmatrix} μ_{1} & μ_{2} \\ μ_{3} & μ_{4} \end{pmatrix} = \begin{pmatrix} b - c & - c \\ b & 0 \end{pmatrix}.$ This payoff matrix determines an additive PD game in which cooperation ($C$) incurs a fixed cost $c > 0$ to the individual adopting it, but provides a fixed benefit $b > 0$ to the opponent, while defection ($D$) incurs no cost at all.
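The expected values above can be checked against the PD conditions $T > R > P > S$ and $2 R > T + S$. Substituting $R = b - c$, $S = - c$, $T = b$ and $P = 0$ shows that both conditions reduce to $b > c > 0$. The short sketch below encodes this check; the function names and the numerical values are illustrative and do not come from the paper.

```python
def is_prisoners_dilemma(R, S, T, P):
    """Check the conditions T > R > P > S and 2R > T + S."""
    return T > R > P > S and 2 * R > T + S


def additive_pd_means(b, c):
    """Expected payoffs (R, S, T, P) of the additive game with benefit b and cost c."""
    return (b - c, -c, b, 0.0)


# Hypothetical values: the conditions hold when b > c > 0 and fail when b < c.
print(is_prisoners_dilemma(*additive_pd_means(b=2.0, c=1.0)))
print(is_prisoners_dilemma(*additive_pd_means(b=1.0, c=2.0)))
```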

In this case, the function $h (x)$ in (34) is given by $h (x) = - c$. Moreover, if $c \leq 1$, then it can be shown (see Appendix B for details) that the function $g (x)$ in (27)

RPD with additive payoffs

In this section, we consider an RPD game with additive payoffs. At time $t \geq 0$, cooperation ($C$) incurs a random cost $c (t) > 0$ to the individual adopting it, but provides a random benefit $b (t) > 0$ to the opponent, while defection ($D$) incurs no cost at all, so that the random payoff matrix takes the form $\begin{pmatrix} η_{1} (t) & η_{2} (t) \\ η_{3} (t) & η_{4} (t) \end{pmatrix} = \begin{pmatrix} b (t) - c (t) & - c (t) \\ b (t) & 0 \end{pmatrix}.$ The main difference with the model in the previous section is that the payoffs are not independent.
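To make the dependence between the payoffs concrete, the sketch below computes the variances and covariances of $η_{1}, \dots, η_{4}$ from the variances of the benefit and the cost and their covariance, using only bilinearity of the covariance. The function name `payoff_covariances` and the numerical inputs are hypothetical, and the population scaling used in the paper is left out.

```python
def payoff_covariances(var_b, var_c, cov_bc):
    """Covariance matrix of (eta_1, eta_2, eta_3, eta_4) = (b - c, -c, b, 0).

    Each payoff is written as coeff_b * b + coeff_c * c, and the covariance
    of two such combinations follows from bilinearity.
    """
    coeffs = [(1, -1), (0, -1), (1, 0), (0, 0)]  # eta_1 .. eta_4

    def cov(u, v):
        return (
            u[0] * v[0] * var_b
            + u[1] * v[1] * var_c
            + (u[0] * v[1] + u[1] * v[0]) * cov_bc
        )

    return [[cov(u, v) for v in coeffs] for u in coeffs]


# Hypothetical second moments of the benefit and the cost.
sigma = payoff_covariances(var_b=2.0, var_c=1.0, cov_bc=0.5)
for row in sigma:
    print(row)
```

For instance, the covariance between $η_{1}$ and $η_{3}$ equals the variance of the benefit minus the covariance between benefit and cost, so it is nonzero in general.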

Here, $c (t)$ and $b (t)$ are assumed to be random variables with $E (b ($

Repeated RPD

We now turn our attention to an RPD game that is repeated a random number of times. There are two pure actions, cooperation ($C$) and defection ($D$), and the payoffs in a single round of interaction between two players at time $t \geq 0$ are given by the random game matrix $\begin{pmatrix} R (t) & S (t) \\ T (t) & P (t) \end{pmatrix}.$ Here, $R (t)$ and $S (t)$ are the payoffs to action $C$ against $C$ and $D$, respectively, while $T (t)$ and $P (t)$ are the corresponding payoffs to action $D$ against the same two actions. These payoffs are assumed to be independent
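As a simplified illustration of the repeated game with the Tit-for-Tat and Always-Defect strategies named in the abstract, the sketch below plays a fixed number of rounds with constant payoffs. Both simplifications are assumptions made here for clarity: in the model described above the single-round payoffs and the number of rounds are random. All function names and payoff values are hypothetical.

```python
def tit_for_tat(opponent_last_move):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return "C" if opponent_last_move is None else opponent_last_move


def always_defect(opponent_last_move):
    """Defect in every round."""
    return "D"


def play(strategy_a, strategy_b, rounds, R, S, T, P):
    """Total payoffs to players a and b over a fixed number of rounds."""
    payoff = {
        ("C", "C"): (R, R),
        ("C", "D"): (S, T),
        ("D", "C"): (T, S),
        ("D", "D"): (P, P),
    }
    last_a = last_b = None
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(last_b)
        move_b = strategy_b(last_a)
        gain_a, gain_b = payoff[(move_a, move_b)]
        total_a += gain_a
        total_b += gain_b
        last_a, last_b = move_a, move_b
    return total_a, total_b


# Hypothetical constant payoffs with T > R > P > S and 2R > T + S.
values = dict(R=3, S=0, T=5, P=1)
print(play(tit_for_tat, tit_for_tat, 5, **values))
print(play(tit_for_tat, always_defect, 5, **values))
print(play(always_defect, always_defect, 5, **values))
```

With $n$ rounds and constant payoffs, Tit-for-Tat earns $n R$ against itself and $S + (n - 1) P$ against Always-Defect, which earns $T + (n - 1) P$ in return.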

Discussion

Environmental noise in the payoffs of a matrix game may have important effects on the evolutionary dynamics, and even change the outcome of evolution. As a matter of fact, the dynamics is driven not only by the expected values of the payoffs but also by their variances. Variability in payoffs can push the time average of a population state far from its interior Nash equilibrium (Broom, 2005) or even change the stability of a fixation state (Stollmeier and Nagler, 2018). In the case of a

Acknowledgments

This research was supported in part by NSERC of Canada (Grant no. 8833) and Chinese Academy of Sciences President’s International Fellowship Initiative (Grant no. 2016VBA039). We thank two anonymous referees for helpful comments to improve this paper.

References

Broom, M. Evolutionary games with variable payoffs. C. R. Biol. (2005)

Karlin, S. et al. Temporal fluctuations in selection intensities: case of small population size. Theor. Popul. Biol. (1974)

Lessard, S. Long-term stability from fixation probabilities in finite populations: New perspectives for ESS theory. Theor. Popul. Biol. (2005)

Axelrod, R. The Evolution of Cooperation (1984)

Axelrod, R. et al. The evolution of cooperation. Science (1981)

Cannings, C. The latent roots of certain Markov chains arising in genetics: A new approach, I. Haploid models. Adv. Appl. Probab. (1974)

Eshel, I. et al. Assortment of encounters and evolution of cooperativeness. Proc. Natl. Acad. Sci. USA (1982)

Ewens, W.J. Mathematical Population Genetics: I Theoretical Introduction (2004)

Fisher, R.A. The Genetical Theory of Natural Selection (1930)

Fudenberg, D. et al. Slow to anger and fast to forgive: Cooperation in an uncertain world. Amer. Econ. Rev. (2012)

Hilbe, C. et al. Evolution of cooperation in stochastic games. Nature (2018)

Hofbauer, J. et al. The Theory of Evolution and Dynamical Systems (1998)

Kaplan, H. et al.

Kimura, M. Diffusion models in population genetics. J. Appl. Probab. (1964)

Lande, R. et al. Stochastic Population Dynamics in Ecology and Conservation (2003)

Lessard, S.

Li, C. et al. Uncertainty in payoffs for defection could be conducive to the evolution of cooperative behavior (2019)

Cited by (11)

The emergence of cooperative behavior based on random payoff and heterogeneity of concerning social image
2024, Chaos, Solitons and Fractals
The heterogeneity among individuals in a group usually leads to the volatility of the individual’s income in the game and the difference in the attention paid to the individual’s social image. This paper proposes a spatial prisoner’s dilemma model with random fluctuations of temptation returns and the social images of individuals considered. The payoff of a defector in the game with a cooperator is set as a random variable. And the social image value of an individual who holds the cooperative strategy is set to a positive value, while the social image value of an individual who holds defective strategy is set to zero. Some individuals in the group care about the social images, while others do not. Social image values and benefits are combined into the fitness of individual. The learning strategy adopted is the combination of optimal fitness and Fermi rule based on return difference.
Numerical experiments indicate that the volatility of the temptation value is beneficial for cooperation under low temptation, but the volatility of the defector’s income is unfavorable for cooperation under high temptation. And the volatility increases the complexity of system evolution, thereby prolonging the evolution time required to reach steady state. The presence of individuals in the system who care about social image can promote cooperation under low temptation, but the proportion of individuals concerned with social image has little effect on the cooperation behavior. When the social image value of individuals with cooperative strategy increases, the cooperative behavior characteristics of the system do not change qualitatively.
Evolution of cooperation with respect to fixation probabilities in multi-player games with random payoffs
2022, Theoretical Population Biology
Citation Excerpt:
A stochastic version of the continuous-time replicator equation with a random noise added to the growth rate of every strategy was considered in Fudenberg and Harris (1992). Recently, the effect of stochastic changes in payoffs in discrete time was studied with particular attention to stochastic local stability of fixation states and constant polymorphic equilibria in an infinite population (Zheng et al., 2017, 2018), the fixation probability in a large population that reproduces according to a Wright–Fisher model (Li and Lessard, 2020) and the average abundance of strategies in a finite population that reproduces according to a Moran model (Kroumi and Lessard, 2021a,b). In general, variability in cost and benefit in social dilemmas introduces variances and covariances in payoffs for cooperation and defection.
We study the effect of variability in payoffs on the evolution of cooperation ($C$) against defection ($D$) in multi-player games in a finite well-mixed population. We show that an increase in the covariance between any two payoffs to $D$, or a decrease in the covariance between any two payoffs to $C$, increases the probability of ultimate fixation of $C$ when represented once, and decreases the corresponding fixation probability for $D$. This is also the case with an increase in the covariance between any payoff to $C$ and any payoff to $D$ if and only if the sum of the numbers of $C$-players in the group associated with these payoffs is large enough compared to the group size. In classical social dilemmas with random cost and benefit for cooperation, the evolution of $C$ is more likely to occur if the variances of the cost and benefit, as well as the group size, are small, while the covariance between cost and benefit is large.
Inclusive fitness and Hamilton's rule in a stochastic environment
2021, Theoretical Population Biology
Citation Excerpt:
See also Taylor et al. (2007), Lessard (2009) for further results on inclusive fitness in finite structured populations. As for environmental stochasticity, refer to Gillespie (1973), Karlin and Levikson (1974), Karlin and Liberman (1974) for early contributions on the effect of between-generation variance in selection parameters, and McNamara (1995), Zheng et al. (2017), Li and Lessard (2020) for more recent studies in the context of evolutionary game theory. This paper deals with implications of these studies on inclusive fitness theory and Hamilton’s rule.
The evolution of cooperation in Prisoner’s Dilemmas with additive random cost and benefit for cooperation cannot be accounted for by Hamilton’s rule based on mean effects transferred from recipients to donors weighted by coefficients of relatedness, which defines inclusive fitness in a constant environment. Extensions that involve higher moments of stochastic effects are possible, however, and these are connected to a concept of random inclusive fitness that is frequency-dependent. This is shown in the setting of pairwise interactions in a haploid population with the same coefficient of relatedness between interacting players. In an infinite population, fixation of cooperation is stochastically stable if a mean geometric inclusive fitness of defection when rare is negative, while fixation of defection is stochastically unstable if a mean geometric inclusive fitness of cooperation when rare is positive, and these conditions are generally not equivalent. In a finite population, the probability for cooperation to ultimately fix when represented once exceeds the probability under neutrality or the corresponding probability for defection if the mean inclusive fitness of cooperation when its frequency is $1 / 3$ or $1 / 2$, respectively, exceeds 1. All these results rely on the simplifying assumption of a linear fitness function. It is argued that meaningful applications of random inclusive fitness in complex settings (multi-player game, diploidy, population structure) would generally require conditions of weak selection and additive gene action.
The effect of variability in payoffs on average abundance in two-player linear games under symmetric mutation
2021, Journal of Theoretical Biology
Citation Excerpt:
We have shown that the average abundance of a strategy in the stationary state is driven not only by the means of the payoffs but also by their variances and covariances. These results agree with the fact that conditions for selection to favor the evolution of cooperation, favor more the evolution of cooperation than the evolution of defection, and disfavor the evolution of defection, all with respect to fixation probability in the absence of mutation, are lessened with an increase in the variances of the payoffs for defection and a decrease in the variances of the payoffs for cooperation (Li and Lessard, 2020). They agree also with the fact that the evolution of cooperation tends to be favored by selection in a large population with respect to the concepts of stochastic local stability (SLS) and stochastic evolutionary stability (SES) applied to Prisoner’s Dilemmas with random payoffs (Zheng et al., 2017; Zheng et al., 2018) if the coefficients of variation of the payoffs are smaller for cooperation than for defection (Li et al., 2020).
Classical studies in evolutionary game theory assume constant payoffs. Randomly fluctuating environments in real populations make this assumption idealistic. In this paper, we study randomized two-player linear games in a finite population in a succession of birth-death events according to a Moran process and in the presence of symmetric mutation. Introducing identity measures under neutrality that depend on the mutation rate and calculating these in the limit of a large population size by using the coalescent process, we study the first-order effect of the means, variances and covariances of the payoffs on average abundance in the stationary state under mutation and selection. This shows how the average abundance of a strategy is driven not only by its mean payoffs but also by the variances and covariances of its payoffs. In Prisoner’s Dilemmas with additive cost and benefit for cooperation, where constant payoffs always favor the abundance of defection, stochastic fluctuations in the payoffs can change the strategy that is more abundant on average in the stationary state. The average abundance of cooperation is increased if the variance of any payoff to cooperation against cooperation or defection, or their covariance, is decreased, or if the variance of any payoff to defection against cooperation or defection, or their covariance, is increased. This is also the case for a Prisoner’s Dilemma with independent payoffs that is repeated a random number of times. As for the mutation rate, it comes into play in the coefficients of the variances and covariances that determine average abundance. Increasing the mutation rate can enhance or lessen the condition for a strategy to be more abundant on average than another.
The effect of the opting-out strategy on conditions for selection to favor the evolution of cooperation in a finite population
2021, Journal of Theoretical Biology
We consider a Prisoner’s Dilemma (PD) that is repeated with some probability $1 - ρ$ only between cooperators as a result of an opting-out strategy adopted by all individuals. The population is made of $N$ pairs of individuals and is updated at every time step by a birth–death event according to a Moran model. Assuming an intensity of selection of order $1 / N$ and taking $2 N^{2}$ birth–death events as unit of time, a diffusion approximation exhibiting two time scales, a fast one for pair frequencies and a slow one for cooperation ($C$) and defection ($D$) frequencies, is ascertained in the limit of a large population size. This diffusion approximation is applied to an additive PD game, cooperation by an individual incurring a cost $c$ to the individual but providing a benefit $b$ to the opponent. This is used to obtain the probability of ultimate fixation of $C$ introduced as a single mutant in an all-$D$ population under selection, which can be compared to the probability under neutrality, $1 / (2 N)$, as well as the corresponding probability for a single $D$ introduced in an all-$C$ population under selection. This gives conditions for cooperation to be favored by selection. We show that these conditions are satisfied when the benefit-to-cost ratio, $b / c$, exceeds some increasing function of $ρ$ that is approximately given by $(1 + \sqrt{ρ}) / (1 - \sqrt{ρ})$. This condition is more stringent, however, than the condition for tit-for-tat (TFT) to be favored against always-defect (AllD) in the absence of opting-out.
Stochastic viability in an island model with partial dispersal: Approximation by a diffusion process in the limit of a large number of islands
2023, arXiv
