1 Introduction

Individuals engage with myriad public goods in their daily lives, ranging from environmental quality to human services to public spaces and public art, and they may contribute to these various causes through monetary donations, in-kind gifts, and volunteerism. This diverse landscape presents private citizens with a rich choice set for public good contributions, but it also creates challenges for coordination and cost-effective allocation of resources. Given vast heterogeneity in public goods and technologies for augmenting public goods, there is latitude for misallocation of resources across causes (Chan & Wolk, 2020). These issues have become even more pronounced in recent times, as online crowdfunding and patronage platforms offer a growing menu of public good possibilities to a rapidly expanding base of contributors.

In this paper, we investigate the determinants of cost-(in)effective contributions to public goods, exploring factors at both the institutional and individual levels. At the institutional level, we examine whether patterns of giving are influenced by consequential uncertainty over the value of public good contributions. At the individual level, we consider risk and ambiguity attitudes, giving type as elicited through a charitable giving task, and common demographic covariates. To understand these relationships, we model behavior in a public good game that accounts for heterogeneity in giving types and social preferences. We derive from this model testable predictions, and we proceed to test these hypotheses in a pre-registered online experiment.

Our experimental design builds on that of Chan and Wolk (2020), which features a set of four simultaneous public good contribution decisions with different marginal per capita returns (MPCRs). Importantly, this multiple public good environment makes possible cost-ineffective contributions, as subjects may contribute at low MPCRs without exhausting all contribution possibilities at high MPCRs. We extend this framework to examine how individual and institutional factors affect the cost-effectiveness of contributions. In particular, we construct two treatments in which the MPCR of each public good is a random variable with known bounds. Importantly, bounds are non-overlapping across the four public goods, allowing us to construct a clean and novel measure of cost-ineffectiveness to test our hypotheses. In one treatment, the distributions for these random variables are known (Risky treatment), while in another treatment, the distributions are unknown (Ambiguity treatment); we compare these treatments to each other and to a control treatment in which MPCRs are certain (Certain treatment). We also include an array of additional tasks to elicit individual characteristics, allowing us to investigate the relationship between cost-ineffectiveness and demographics, risk and ambiguity attitudes, risk literacy, giving types, and attentiveness.

We construct a measure of cost-ineffectiveness that conditions on the subject’s total contribution. In this way, the measure focuses on how well the subject allocates the contribution across disparate public goods; the measure should not be directly affected by the subject’s cooperativeness or generosity and is thus well-suited for our subsequent analysis. We conduct a series of parametric and non-parametric tests to compare contribution behavior across treatments and across individuals. Cost-ineffective giving is present in all three treatments, indicating that subjects are not maximizing the impact and payoffs generated by their chosen level of contributions. Interestingly, total contributions and the degree of cost-ineffectiveness are nearly identical across treatments, which suggests that the information environment does not affect these margins of decision-making. Turning to individual characteristics, we find little evidence that risk or ambiguity attitudes affect cost-ineffectiveness. However, there are some differences across giving types. In particular, we find that individuals who are at least partially motivated by warm glow (i.e., pure warm-glow givers and impure altruists) contribute less cost-effectively than pure altruists and non-donors. This finding accords with predictions from theory, although we are the first, to our knowledge, to document this empirically. Our findings are robust for within- and between-subjects analyses.

Our experimental results provide important insights for the broader world. Individuals have always had many avenues for augmenting public goods, e.g., when facing multiple charitable causes. These choice sets continue to expand with modern crowdfunding and patronage platforms, making room for inefficient allocation of resources across public goods. Inefficiency may arise from individual or institutional factors, and understanding the influence of these different factors is crucial to improving public good provision. In many cases, there may be uncertainty about the productivity of different public good investments. For example, how well do investments in civic crowdfunding projects (e.g., urban greenspaces, public art installations, etc.) enhance social cohesion? How well do individual actions (e.g., wearing face masks, minimizing social contact) help blunt the spread of infectious diseases? There are also important questions about how public good contributions differ along dimensions of individual heterogeneity. What types of individuals are most likely to contribute in cost-(in)effective ways? Overall, our work provides novel and timely insights into both positive and normative aspects of public good provision.

Our work advances several distinct lines of inquiry in the economics of public good provision and charitable giving. In focusing on efficiency, our work ties in with discussions spurred by the Effective Altruism movement. Effective Altruists urge donors to give to charities that generate the greatest benefit per dollar donated. This idea has been a topic of substantial interest among philosophers and ethicists (MacAskill, 2016; Singer, 2015), but it also has natural intersections with efficiency and cost-effectiveness concerns long articulated by economists (Karlan & Wood, 2017; List, 2011). These principles have also gained traction outside of academia, e.g., in the form of organizations like Charity Navigator and GiveWell, which seek to quantify the social impact of dollars donated across charities and causes.

Yet, in spite of clear economic and ethical rationales for effective altruism, many individuals still appear unresponsive or inattentive to the effectiveness of their charitable efforts. Evidence from laboratory and field experiments shows that most donors do not increase their donations when provided with information about the effectiveness of the charity (Clark et al., 2018; Karlan & Wood, 2017). Metzger & Günther (2019) show that many participants in laboratory experiments are unwilling to purchase, even for a minimal fee, information on the efficiency of the charity they have been asked to donate to.Footnote 1 Instead, these participants show greater interest in purchasing information about the people that the charity will help, suggesting that efficiency is not a top priority. Along similar lines, Berman et al. (2018) report that, for most participants, emotional attachment to a charitable cause is more important than the effectiveness of the charity, which may explain why few people behave as effective altruists. Genç et al. (2020) show via a choice experiment that most donors place significantly greater weight on where a donation is spent (preferring it to be spent closer to home), while assigning less importance to the effectiveness of the donation or the needs of the recipient. In short, there is robust evidence that many donors show little interest in the efficiency of the charities to which they are donating.

We explore how this (in)attention to cost-effectiveness may relate to institutional and individual factors. Our investigation into individual characteristics contributes to a broader literature on how contribution behaviors differ across motivations for giving. An influential body of theory within economics separates motives for giving into two main types: pure altruists and warm-glow givers (Andreoni, 1989, 1990; Ribar & Wilhelm, 2002; Warr, 1982; Yildirim, 2014). Pure altruists derive utility from the total amount of the charitable good provided. To a pure altruist, her donation and the donations of others are perfect substitutes, leading to crowding out of charitable donations (Warr, 1982). In contrast, a warm-glow giver derives utility from the act of giving itself, thus removing the scope for crowding out (Andreoni, 1989, 1990). A third category, impure altruists, comprises individuals who earn utility both from own donations and total donations. It is often assumed that warm-glow givers are more prone to inefficient donations than pure altruists (Singer, 2015), a result that follows clearly from theory. However, to our knowledge, there is no direct empirical evidence supporting this claim. For example, Null (2011) separates donors into different types by assuming those who donate inefficiently must be warm-glow givers (after ruling out risk aversion). Similarly, Karlan and Wood (2017) report that some participants increase donations when presented with information on the effectiveness of donations, but many participants do not. They posit that the latter group may include warm-glow givers, but their experimental design does not allow them to ascertain this. Against this backdrop, our work provides an important contribution by demonstrating a direct link between giving type and cost-effective public good provision.

Individuals have a wide array of opportunities to contribute to public goods and charitable causes, so an implicit concern raised in all of the aforementioned studies is that donors may misallocate their resources toward less worthy causes. A number of recent studies have tackled this issue of multiple public goods directly. Much of the work in this realm focuses on coordination of donors across public goods. Corazzini et al. (2015) show that coordination problems can be eliminated by making one public good focal by offering better payoffs. Earlier work by Cherry and Dickinson (2008) similarly finds that subjects successfully coordinate on the option with highest social returns when faced with multiple public goods, even when the level of social returns is endogenous to aggregate contributions. Interestingly, when comparing a setting with multiple homogeneous public goods to an equivalent setting with a single public good, they report greater contributions in the former than the latter, suggesting the importance of framing. Bernasconi et al. (2009) also investigate this “unpacking” effect and similarly find improvements in contribution levels in the unpacked case. Similar to the above papers, Blackwell and McKee (2003) construct an environment with global and local public goods. They find that when the two goods provide equal societal returns, subjects donate more to the local public good; however, when social returns are higher for the global cause, subjects give more to the global public good—in spite of it generating smaller private returns.Footnote 2

We build on this tradition by implementing an experiment with multiple public goods, but we highlight a different aspect of the problem. Extending the design of Chan and Wolk (2020), we eliminate the scope for coordination problems, thus allowing sharper focus on individual allocations as the locus of inefficiency. In this way, our experiment more directly addresses issues raised by effective altruists, who express concerns about individuals’ willingness to allocate funds to inferior causes. Although our work bears similarities to Chan and Wolk (2020), it is distinct. Whereas Chan and Wolk (2020) provide a proof-of-concept for this experimental design and shed light on framing effects induced by the choice sets, we provide deeper insight into policy relevant determinants of cost-ineffectiveness. We show how the propensity for cost-(in)effective donations may be influenced by individual characteristics and the broader information environment.

Our findings on the provision of inferior public goods also shed light on the industrial organization of charities. In particular, how do less efficient and less impactful charities survive in the market? We find that warm-glow givers and impure altruists often spread their contributions across causes, including inferior ones that might not otherwise survive under effective altruism. This pattern of behavior can uphold otherwise unproductive charities, with accompanying implications for efficiency of charity markets and social welfare.

Incorporating uncertainty has been a topic of interest in the public good games literature. However, while a number of papers study risk in public good games [see, e.g., Dickinson (1998), Gangadharan and Nemes (2009), Freundt and Lange (2021) and Théroude and Zylbersztejn (2020)], few study ambiguity. Levati and Morone (2013) and Björk et al. (2016) are among the few to examine both risk and ambiguity, and they do not find significant differences in contribution behavior in situations involving risky, ambiguous, or deterministic MPCRs for single public good settings. According to the latter, there is also no interaction between strategic uncertainty and natural uncertainty. They find that cooperative attitudes and beliefs about group members’ contributions are unaffected by natural uncertainty. Even so, focus has remained on settings with single public goods, which does not allow for studying cost-effectiveness. Our work thus provides novel insights into the interplay between cost-effectiveness and the overarching information environment.

The remainder of the paper proceeds as follows: Section 2 provides a general model, lays out hypotheses for our public good game, and describes details of our experimental implementation. We discuss results in Sect. 3, while Sect. 4 concludes.

2 Experimental design and hypotheses

2.1 Design

We begin by laying out a general model of our game environment. Let there be n players and m public goods with prices normalized to unity. Following Chan and Wolk (2020), each player i has budget \(w^i_j>0\) to allocate to public good j: \(x^i_j\in [0,w^i_j]\). The marginal per capita return (MPCR) of public good j is denoted \(\gamma _j\). That is, player i’s contribution to public good j, \(x^i_j\), produces a benefit of \(\gamma _j\cdot x^i_j\) to all players \(k=1,\ldots ,n\). For each public good \(j=1,\ldots ,m\), we assume \(\gamma _j\in (0,1)\), such that not contributing to any of the public goods is the only rationalizable strategy for a selfish, payoff-maximizing player.

Departing from Chan and Wolk (2020), let the MPCRs be subject to risk: \(\gamma _j\sim F_j\). Let \({\widetilde{\gamma }}_j=\int _{\gamma _j}\gamma _j \,\mathrm{d}F_j\) and let the public goods be ordered such that \({\widetilde{\gamma }}_1<\cdots <{\widetilde{\gamma }}_m\). Finally, let the support of \(\gamma _j\) be \(({\underline{\gamma }}_j,{\overline{\gamma }}_j)\).Footnote 3

Defining utility over total payoffs, the expected utility of player i is given by

$$\begin{aligned} \int _{\gamma _1}\cdots \int _{\gamma _m} u^i\left( \,\sum _{j=1}^{m}\,\left[ \left( w^i_j-x^i_j\right) +\gamma _j\cdot \left( x^i_j+X^{-i}_j\right) \right] \,\right) \mathrm{d}F_m\cdots \mathrm{d}F_1, \end{aligned}$$

where \(X^{-i}_j=\sum _{k\ne i}x^k_j\). Rewriting this expression as

$$\begin{aligned} \int _{\gamma _1}\cdots \int _{\gamma _m} u^i\left( \sum _{j=1}^{m}\,\left[ \,w^i_j-\left( 1-\gamma _j\right) \cdot x^i_j+\gamma _j\cdot X^{-i}_j\right] \right) \mathrm{d}F_m\cdots \mathrm{d}F_1, \end{aligned}$$

we see that both the effective cost \(1-\gamma _j\) and the effective benefit \(\gamma _j\) are subject to risk.

If a player is risk-neutral, we derive the same condition as Chan and Wolk (2020) for the cost-effective allocation of resources. That is, if \(x^i_j>0\), then cost-effectiveness requires that \(x^i_{\ell }=w^i_{\ell }\) for all \(\ell >j\); otherwise, this player can increase her ex ante expected utility by shifting resources from \(x^i_j\) to \(x^i_{\ell }\). Clearly, this holds in an example where \(u^i(\pi )=\pi \). In this case, the utility expression from above can be simplified to:

$$\begin{aligned} &\int _{\gamma _1}\cdots \int _{\gamma _m} \sum _{j=1}^{m}\left[ \,w^i_j-\left( 1-\gamma _j\right) \cdot x^i_j+\gamma _j\cdot X^{-i}_j\right] \mathrm{d}F_m\cdots \mathrm{d}F_1 \\ &\quad =\,\sum _{j=1}^{m}\left[ w^i_j-\left( 1-{\widetilde{\gamma }}_j\right) \cdot x^i_j+{\widetilde{\gamma }}_j\cdot X^{-i}_j\right] . \end{aligned}$$
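To make the risk-neutral case concrete, the following minimal Python sketch evaluates the expected payoff expression above. It uses the MPCR values of the Certain treatment as the expected MPCRs; the function name `payoff` and the assumption that the other players contribute nothing are purely illustrative and not part of the experimental software.

```python
from typing import Sequence


def payoff(w: Sequence[float], x_own: Sequence[float],
           x_others: Sequence[float], gamma: Sequence[float]) -> float:
    """Payoff of one player: kept endowment plus the return from each public good."""
    return sum(
        (w_j - x_j) + g_j * (x_j + X_j)
        for w_j, x_j, X_j, g_j in zip(w, x_own, x_others, gamma)
    )


# Expected MPCRs, taken from the Certain treatment of the experiment.
GAMMA_TILDE = [0.475, 0.625, 0.775, 0.925]
ENDOWMENT = [10, 10, 10, 10]
OTHERS = [0, 0, 0, 0]  # illustrative assumption: the other players give nothing

# The same 5 points, placed in the lowest- versus the highest-MPCR public good.
low = payoff(ENDOWMENT, [5, 0, 0, 0], OTHERS, GAMMA_TILDE)
high = payoff(ENDOWMENT, [0, 0, 0, 5], OTHERS, GAMMA_TILDE)
gain = high - low  # equals 5 * (0.925 - 0.475)
```

Holding the total contribution fixed at 5 points, moving it from public good 1 to public good 4 raises the player's own expected payoff by the contribution times the difference in expected MPCRs, which is the shift of resources described above. The same gain accrues to every other group member.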

Alternatively, if player i is risk-averse (e.g., \(u^i(\pi )=\pi ^{\alpha }\) with \(\alpha \in (0,1)\)), this is not obvious. The reason is that payoffs may depend on player i’s beliefs about the contributions of others (i.e., \(X^{-i}_j\) for all \(j=1,\ldots ,m\)). For instance, if i expects all others to contribute fully to public good m (\(x^k_m=w^k_m\)) and nothing to the other public goods (\(x^k_j=0\) for \(j<m\)), player i may decide to contribute a positive amount to public good \(m-1\) instead of m to mitigate risk. This possibility arises when \(\gamma _m\) and \(\gamma _{m-1}\) have overlapping supports, so that \(\gamma _{m-1}\) may be realized at a higher value than \(\gamma _m\).

We conduct our experiment with \(n=3\) players and \(m=4\) public goods. For each public good, players have an endowment of \(w^i_j=10\) points available that they can contribute to public good j. We implement three treatments with differing marginal per capita returns (MPCR), \(\gamma _j\), for each public good:

  • Treatment certain. The values of the four MPCRs are certain:

    $$\begin{aligned} \gamma _1=0.475 \quad \gamma _2=0.625 \quad \gamma _3=0.775 \quad \gamma _4=0.925. \end{aligned}$$

  • Treatment risky. The values of the four MPCRs are subject to risk:

    $$\begin{aligned} \gamma _1\sim \text {Un}(0.40,0.55) \quad \gamma _2\sim \text {Un}(0.55,0.70) \quad \gamma _3\sim \text {Un}(0.70,0.85) \quad \gamma _4\sim \text {Un}(0.85,1.00). \end{aligned}$$
  • Treatment ambiguity. The values of the four MPCRs are subject to ambiguity:

    $$\begin{aligned} \gamma _1\in (0.40,0.55) \quad \gamma _2\in (0.55,0.70) \quad \gamma _3\in (0.70,0.85) \quad \gamma _4\in (0.85,1.00). \end{aligned}$$

We specify disjoint supports in the latter two treatments (the supports are open intervals with \({\underline{\gamma }}_{\ell }\ge {\overline{\gamma }}_j\) for all \(\ell >j\)), which ensures that there is a clear ranking of the four MPCRs (\(\gamma _4>\gamma _3>\gamma _2>\gamma _1\)) in all three treatments. As a result, cost-effectiveness of contribution behavior is well-defined in all three treatments, regardless of a player’s risk attitude: Individuals who contribute cost-effectively would not contribute to a public good with a lower MPCR unless they had exhausted their full endowment in all public goods with higher MPCRs. That is, \(x^i_j>0\) only if \(x^i_{\ell }=w^i_{\ell }\) for all \(\ell >j\).
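The ranking of MPCRs and the resulting cost-effectiveness condition can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in the experiment: the names `SUPPORTS`, `draw_risky_mpcrs`, and `is_cost_effective` are hypothetical, and the numerical tolerance is our own choice.

```python
import random
from typing import List, Sequence

# Supports of the four MPCRs in the Risky treatment (lowest to highest).
SUPPORTS = [(0.40, 0.55), (0.55, 0.70), (0.70, 0.85), (0.85, 1.00)]


def draw_risky_mpcrs(rng: random.Random) -> List[float]:
    """Draw one MPCR per public good, each uniform on its own support."""
    return [rng.uniform(lower, upper) for lower, upper in SUPPORTS]


def is_cost_effective(x: Sequence[float], w: Sequence[float],
                      tol: float = 1e-9) -> bool:
    """True if x_j > 0 only when every higher-ranked good l > j is fully funded."""
    for j, x_j in enumerate(x):
        if x_j > tol:
            # Any unused endowment in a higher-MPCR good violates the condition.
            if any(w_l - x_l > tol for x_l, w_l in zip(x[j + 1:], w[j + 1:])):
                return False
    return True


draws = draw_risky_mpcrs(random.Random(0))
ranking_preserved = all(a <= b for a, b in zip(draws, draws[1:]))
```

Because adjacent supports do not overlap, any realized draw preserves the ordering of the public goods, so the check in `is_cost_effective` depends only on the allocation and the endowments, not on the realized MPCRs.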

2.2 Hypotheses

In our pre-registration, we log the following study objective: “Our primary outcome is whether uncertainty of consequences in public goods affect cost-effectiveness of individual contributions.” As our secondary outcome, we ask: “How does cost-effectiveness of contributions relate to individual attitudes toward risk and ambiguity, and giving type?”

Our primary null hypothesis is that players’ contributions are cost-effective in all treatments (regardless of their beliefs about the other players’ choices).

Hypothesis 1

Contributions will be cost-effective in all treatments.

For the secondary objective, we focus on two types of individual characteristics: risk and ambiguity attitudes and giving type. Because we have designed the treatments so that MPCR supports are disjoint, risk and ambiguity attitude should not yield differences in the cost-effectiveness of contributions.

Hypothesis 2

In all treatments, cost-effectiveness of contributions does not correlate with risk or ambiguity attitude.

We now consider the role of giving type. We consider four giving types as defined by Gangadharan et al. (2018) and Gandullia et al. (2020): non-donors, pure altruists, warm-glow givers, and impure altruists. The latter three types may contribute to public goods, but for different reasons. Above, we assumed that players’ utilities are defined over total payoffs to allow for more parsimonious exposition of risk types. Here, we consider richer preference structures to elucidate differences across giving types.

Pure altruists may contribute because they derive utility from the aggregate level of public good provision, e.g.,

$$\begin{aligned} u^i \left( \underbrace{\sum _{j=1}^m \left[ w_j^i - x_j^i\right] }_{\rm{numeraire}}, \underbrace{\sum _{j=1}^m \gamma _j \left[ x_j^i + X_j^{-i}\right] }_{\rm{public\; good}} \right) . \end{aligned}$$
(1)

Such behavior would also be consistent with a preference for efficiency, and there is evidence from prior public good experiments that (some) subjects behave in such a way (Goeree et al., 2002).

However, there is also evidence from public good experiments of warm-glow givers who instead derive utility from their own act of giving (Andreoni, 1993). These individuals may value contributions to each public good separately, e.g.,

$$\begin{aligned} u^i \left( \underbrace{\sum _{j=1}^m \left[ w_j^i - x_j^i\right] }_{\rm{numeraire}}, \underbrace{x_1^i,\ldots , x_m^i}_ {\text {contributions} \;\text {to \;each }\;m} \right) , \end{aligned}$$
(2)

or they may derive utility from the total amount they have contributed across all m public goods, e.g.,

$$\begin{aligned} u^i \left( \underbrace{\sum _{j=1}^m \left[ w_j^i - x_j^i\right] }_{\rm{numeraire}}, \underbrace{\sum _{j=1}^m x_j^i}_{\rm{total\; contributed}} \right) . \end{aligned}$$
(3)

Impure altruists combine both warm-glow and altruistic motives. Importantly, these differences in preference structures across giving types have implications for cost-(in)effective giving. A pure altruist has no reason to contribute cost-ineffectively to the public good, as doing so will reduce the total amount of the public good provided. On the other hand, a warm-glow giver may contribute cost-ineffectively. If there are diminishing marginal warm glow benefits for each specific public good in (2), then a warm-glow giver may spread their contributions across public goods in a cost-ineffective manner. Indeed, this is an underlying assumption of Null (2011). However, even a warm-glow giver who values total contributions over all public goods could act similarly. In this case, contributions to \(m-1\) and m are perfect substitutes in (3), and the individual will be indifferent between contributing to one cause or the other, which can beget a cost-ineffective allocation across causes.
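The contrast between preference structures (1) and (3) can be illustrated numerically. The sketch below computes only the arguments that enter each utility function, leaving the functional form of \(u^i\) unspecified; the function names, the example allocations, and the assumption that the other players contribute nothing are hypothetical choices made for illustration.

```python
from typing import Sequence, Tuple


def pure_altruist_args(w: Sequence[float], x: Sequence[float],
                       x_others: Sequence[float],
                       gamma: Sequence[float]) -> Tuple[float, float]:
    """Arguments of utility (1): numeraire and aggregate public good level."""
    numeraire = sum(w_j - x_j for w_j, x_j in zip(w, x))
    public_good = sum(g_j * (x_j + X_j)
                      for g_j, x_j, X_j in zip(gamma, x, x_others))
    return numeraire, public_good


def warm_glow_total_args(w: Sequence[float],
                         x: Sequence[float]) -> Tuple[float, float]:
    """Arguments of utility (3): numeraire and total amount contributed."""
    numeraire = sum(w_j - x_j for w_j, x_j in zip(w, x))
    return numeraire, sum(x)


GAMMA = [0.475, 0.625, 0.775, 0.925]  # MPCRs of the Certain treatment
W = [10, 10, 10, 10]
OTHERS = [0, 0, 0, 0]                 # illustrative assumption

effective = [0, 0, 0, 5]    # 5 points in the highest-MPCR good
ineffective = [5, 0, 0, 0]  # the same 5 points in the lowest-MPCR good

wg_eff = warm_glow_total_args(W, effective)
wg_ineff = warm_glow_total_args(W, ineffective)
pa_eff = pure_altruist_args(W, effective, OTHERS, GAMMA)
pa_ineff = pure_altruist_args(W, ineffective, OTHERS, GAMMA)
```

Under (3), both allocations deliver identical arguments to the utility function, so a warm-glow giver of this kind is indifferent between them. Under (1), the cost-ineffective allocation yields the same numeraire but a strictly lower public good level, so a pure altruist with utility increasing in the public good strictly prefers the cost-effective allocation.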

As a result, we expect more cost-effective contributions among pure altruists and more cost-ineffective contributions among warm-glow givers, with impure altruists falling between these two poles. We summarize these insights in the following hypothesis:

Hypothesis 3

Pure altruists will contribute in a cost-effective manner. Impure altruists will contribute less cost-effectively than pure altruists, and warm-glow givers will contribute even less cost-effectively than impure altruists.

Because our experimental design directly manipulates uncertainty across treatments, we are best positioned to make causal statements on our primary hypothesis (Hypothesis 1). We do not experimentally manipulate risk and ambiguity attitudes or individual giving types, so our conclusions on the two secondary hypotheses (Hypotheses 2 and 3) will be based on correlational evidence.

2.3 Procedures

We ran two sessions via Prolific, on Tuesday 25 August 2020 and Monday 24 April 2023, inviting up to 216 and 108 potential participants, respectively (18–65 years, fluent in English), to participate in a “decision-making experiment”.Footnote 4 In total, 201 and 107 participants, respectively, accepted the consent form and completed all tasks.Footnote 5 All of these participants completed all three treatments, with four public good tasks in each treatment, but the order in which they encountered the three treatments was randomized at the individual level. After completing the experimental treatments, subjects completed a post-experiment questionnaire that comprised a sequence of short individual tasks in which we elicited gender, age, giving type (Gangadharan et al., 2018; Gandullia et al., 2020), risk attitude (Eckel & Grossman, 2002, 2008), ambiguity attitude (Baillon et al., 2018), attention level (Frederick, 2005; Sirota & Juanchich, 2018), and risk literacy (Cokely et al., 2012). We offer brief descriptions of these tasks in the next section and provide full details in Appendix A.

Groups were formed as soon as a triple of participants completed the full suite of tasks.Footnote 6 Each treatment was completed as a one-shot game.Footnote 7 The three players within a group were paid according to the same randomly drawn treatment (although players may have visited those treatments in a different sequence) and for each public good according to the same (randomly drawn) MPCR, and received feedback on their final earnings at the end of the experiment.Footnote 8 All this was common knowledge to the participants. On average, participants earned \(\pounds 4.08\) in variable payment, in addition to the participation fee (which was \(\pounds 4.00\) in the first session and \(\pounds 5.00\) in the second session) for on average 23 min and 27 s of their time. Subjects were informed of their earnings for all tasks at the end of the experiment; importantly, this means they were unaware of their earnings from the public goods game when they completed the giving-type elicitation.Footnote 9

Prior to data collection, ethical approval was obtained from Vrije University School of Business and Economics Research Ethics Review Board (reference code SBE6/9/2020kwk350) and University of Otago’s Human Ethics Committee (reference code D20/183). Moreover, this study was pre-registered on 18 August 2020 in the AEA RCT Registry under the unique identifying number “AEARCTR-0006304”. The experiment was programmed using oTree (Chen et al., 2016). Screenshots are available in Appendix C.

One advantage of the online setting is that we can track reading times for each page in the study. We find substantive engagement throughout the study. On average, subjects spent 5 min and 46 s on the instructions for our main experimental task and an additional 3 min and 24 s on the tasks themselves. There were also no noticeable drops in engagement over the course of the study. The residence times on each page of our post-experimental questionnaire suggest reflective engagement with each, with longer durations for more complicated questions. For example, subjects spent the most time on our ambiguity attitude questions (4 min and 22 s, on average) and risk literacy questions (2 min and 4 s, on average), the latter of which was the final task in the study. In total, subjects spent 23 min and 27 s, on average, on the study. Together, these data speak to the attentiveness of our sample. Furthermore, in line with best practices for online experiments, we required subjects to complete several control questions as comprehension checks before proceeding to the main experimental tasks. The comprehension checks for the public goods games were set up as four simultaneous questions presented on a single screen: Three true/false questions and one calculation question (see Appendix C for screenshots of the questions). Participants had one chance to get the questions right. If a participant answered a question incorrectly, we showed an explanation of the correct answer, after which they needed to revise their answer to the correct one before proceeding. The percentages of participants who correctly answered the comprehension checks are 76.6% (Q1, true/false), 79.6% (Q2, true/false), 28.6% (Q3, true/false), and 47.7% (Q4, open answer, correct answer is both the mode and median). With the exception of question 3, which was worded such that a participant who did not pay very close attention could easily answer it incorrectly, participants appear to comprehend the rules of the game well.Footnote 10

3 Results

3.1 Descriptive statistics

Our post-experimental questionnaire included survey questions for basic demographic variables (gender and age) and a series of tasks to elicit each participant’s giving type, risk attitude, ambiguity attitude, risk literacy, and attentiveness. We summarize these tasks here briefly and provide additional details on each in Appendix A.

For giving type, we use a two-stage charitable giving task, following the design of Gangadharan et al. (2018) as implemented by Gandullia et al. (2020). Based on the donations made in each stage of this task, we classify each participant as a non-donor, a pure warm-glow donor, a pure altruist, or an impure altruist. To elicit risk attitudes, we use the lottery menu described by Eckel and Grossman (2002, 2008), and we use the method of Baillon et al. (2018) to elicit ambiguity attitudes. We normalize both scales so that they range from 0 to 1, with low values indicating risk/ambiguity-averse attitudes and high values indicating risk/ambiguity-loving attitudes. Attentiveness was measured with a cognitive reflection test (CRT) (Frederick, 2005) using the multiple choice format described by Sirota and Juanchich (2018). We use the four questions from the Berlin Numeracy Test of Cokely et al. (2012) to measure risk literacy. Both attentiveness and risk literacy are encoded as the fraction of correct answers given to these survey questions.

Table 1 presents participant characteristics based on the post-experimental questionnaire. We show statistics for each treatment sequence (as described above) and for the full sample; we use the following abbreviations: Certain (C), Risky (R), and Ambiguity (A). Kruskal–Wallis tests do not indicate significant differences in participants’ gender, age, total donations in the giving-type task, risk attitude, ambiguity attitude, attention level, and risk literacy across the orders (\(p>.33\)).Footnote 11

Table 1 Participant characteristics

We now turn to participant behavior in the experiment. Table 2 presents average contribution levels across the four different public goods, by treatment environment and by sequence. Kruskal–Wallis tests do not identify any order effects in any of the three treatments for participants’ contributions to individual public goods or in participants’ aggregate contributions over public goods (\(p>.31\)).

Table 2 Summary statistics for contribution levels

In terms of the time participants required to complete the decision screens, average completion times are quite similar across treatments (63 s for Certain, 68 s for Risky, and 73 s for Ambiguity). However, when further splitting each treatment by order, a Kruskal–Wallis test identifies highly significant differences for the three treatments across orders (\(p=.0001\)). Further inspection reveals that the observed difference arises because participants progressively decreased completion times over the course of the experiment, likely due to growing familiarity with the rules of the game, which varied only along a single dimension across the three screens. On average, the first task required 116 s to complete, the second task required 51 s to complete, and the third task required 37 s to complete. Kruskal–Wallis tests do not reveal significant differences across orders for the time taken to complete the first, second, and third task (\(p>.11\)).

Overall, these various analyses give us confidence that there are no order effects. Thus, to generate our primary results, we pool the data and conduct a within-subjects analysis. However, for the sake of transparency and robustness, we replicate all major analyses using a between-subjects approach in Appendix B.

3.2 Contribution behavior and cost-effectiveness

We begin by presenting results on basic contribution behavior to foreground subsequent results on cost-(in)effectiveness. For clarity, we refer to the Wilcoxon signed-rank test as the Wilcoxon test and the Mann–Whitney U (or Mann–Whitney–Wilcoxon) rank-sum test as the Mann–Whitney test throughout the paper.

Table 3 shows, for each of the three treatments, the participants’ average contributions to each of the four public goods and aggregated over the four public goods. Wilcoxon tests comparing contributions between consecutive public goods reveal that in all three treatments subjects are responsive to the MPCR (\(p<.0001\)), with average contributions increasing in MPCR.

Table 3 Contributions across treatments

Importantly, participants’ aggregate contributions over the four public goods do not vary significantly across treatments, as can be seen in Fig. 1, which shows the cumulative distributions over these total contributions for each of the treatments. Given the almost perfect overlap of the three curves, it is no surprise that neither a Kruskal–Wallis test for mean equivalence by treatment (\(p=.8476\)) nor Wilcoxon tests reveal significant differences in total contributions across treatments (\(p>.48\)). In terms of contributions to individual public goods, Kruskal–Wallis tests do not return any significant differences between treatments (\(p>.63\)).

Fig. 1 Total contributions

We now turn to our main research question: whether cost-effectiveness differs across treatments. We begin by defining cost-ineffectiveness as

$$\begin{aligned} \mathop {\rm{CI}}\left( x^i\right) = \sum _{j=2}^m \min \left\{ \sum _{\ell =1}^{j-1}x^i_{\ell } ; \sum _{\ell =j}^m\,\left( w^i_{\ell }-x^i_{\ell }\right) \right\} , \end{aligned}$$

where \(w^i_{\ell }=10\) and \(m=4\) in our experiment. The minimum value for \(\rm{CI}(x^i)\) is 0 (no ineffectiveness) and the maximum is 40, obtained with the allocation (10, 10, 0, 0).

Notably, the maximal attainable \(\mathop {\rm{CI}}\) varies with the total contribution amount. For example, an individual who contributes only 1 unit could have \(\mathop {\rm{CI}}=3\) if she contributes that one unit to PG1. If she contributed 2 units, and both to PG1, she would instead have \(\mathop {\rm{CI}}=6\). Because total contributions are endogenous, this property jeopardizes between-subject comparisons. To correct for this, we develop a measure of relative cost-ineffectiveness.
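The definition of \(\mathop {\rm{CI}}\) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the code used for the analysis: the function name `cost_ineffectiveness` and the list-based representation are our own, and contributions are assumed to be ordered from the lowest-MPCR good (PG1) to the highest.

```python
from typing import Sequence


def cost_ineffectiveness(x: Sequence[float], w: Sequence[float]) -> float:
    """CI(x) = sum_{j=2}^{m} min( sum_{l<j} x_l , sum_{l>=j} (w_l - x_l) ).

    x[k] is the contribution to public good k+1, ordered from the lowest
    MPCR (PG1) to the highest; w[k] is the corresponding endowment.
    The function name and data layout are illustrative assumptions.
    """
    m = len(x)
    total = 0
    for j in range(1, m):  # 0-based j = 1..m-1 corresponds to j = 2..m in the formula
        below = sum(x[:j])  # units placed in goods less effective than good j
        room_above = sum(wl - xl for wl, xl in zip(w[j:], x[j:]))  # unused capacity in goods j..m
        total += min(below, room_above)
    return total


w = [10, 10, 10, 10]  # w = 10 and m = 4, as in the experiment
print(cost_ineffectiveness([10, 10, 0, 0], w))  # 40, the maximum
print(cost_ineffectiveness([1, 0, 0, 0], w))    # 3
print(cost_ineffectiveness([2, 0, 0, 0], w))    # 6
```

The three printed values reproduce the maximum of 40 and the one-unit and two-unit examples given in the text.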

Take an individual who contributes \(X^i=\sum _{j=1}^mx^i_j\) in total. The maximum cost-ineffectiveness is obtained by this amount being contributed to the least effective public goods. This is the allocation \(y^i(X^i)\) defined as

$$\begin{aligned} y^i_j\left( X^i\right) = \max \left\{ \min \left\{ X^i-\sum _{\ell =1}^{j-1}w^i_{\ell } ; w^i_j\right\} ; 0\right\} . \end{aligned}$$

Now, we can define relative cost-ineffectiveness as

$$\begin{aligned} \mathop {\rm{RCI}}\left( x^i\right) = \frac{\mathop {\rm{CI}}\left( x^i\right) }{\mathop {\rm{CI}}\left( y^i\left( \sum _{j=1}^m x^i_j\right) \right) }. \end{aligned}$$

This measure is attractive because it conditions on the total amount contributed, thus allowing for comparison across subjects.
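As a rough sketch of how the normalization works, the following Python code builds the most cost-ineffective allocation \(y^i(X^i)\) and divides the two \(\mathop {\rm{CI}}\) values. The function names and the choice to return `None` when the denominator is zero are our own assumptions, and the input profiles are purely illustrative rather than taken from the data.

```python
from typing import List, Optional, Sequence


def cost_ineffectiveness(x: Sequence[float], w: Sequence[float]) -> float:
    """CI(x): sum over j = 2..m of min(units below good j, unused capacity in goods j..m)."""
    return sum(
        min(sum(x[:j]), sum(wl - xl for wl, xl in zip(w[j:], x[j:])))
        for j in range(1, len(x))
    )


def worst_allocation(total: float, w: Sequence[float]) -> List[float]:
    """y_j(X) = max(min(X - sum_{l<j} w_l, w_j), 0): fill goods from the lowest MPCR upward."""
    y = []
    filled = 0  # total endowment of the goods already considered
    for wj in w:
        y.append(max(min(total - filled, wj), 0))
        filled += wj
    return y


def relative_cost_ineffectiveness(x: Sequence[float], w: Sequence[float]) -> Optional[float]:
    """RCI(x) = CI(x) / CI(y(sum(x))); None when the maximal CI is zero."""
    worst = cost_ineffectiveness(worst_allocation(sum(x), w), w)
    if worst == 0:  # e.g. nothing contributed, or every endowment fully contributed
        return None
    return cost_ineffectiveness(x, w) / worst


w = [10, 10, 10, 10]
print(worst_allocation(20, w))                            # [10, 10, 0, 0]
print(relative_cost_ineffectiveness([10, 10, 0, 0], w))   # 1.0
print(relative_cost_ineffectiveness([5, 5, 5, 5], w))     # 0.5
print(relative_cost_ineffectiveness([0, 0, 10, 10], w))   # 0.0
```

All three illustrative profiles contribute 20 units in total, so they share the same denominator of 40 and differ only in how those units are spread across the goods.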

To provide intuition on the process of calculating \(\mathop {\rm{CI}}\) and \(\mathop {\rm{RCI}}\), we offer an example in Fig. 2. The individual in the left panel of this figure contributes 2 units to PG1, 5 units to PG2, 7 units to PG3 and 9 units to PG4, for a total of 23 units across the four public goods. Had this individual provided these 23 units in the most cost-effective manner, she would have contributed according to the contribution profile in the middle panel. Starting from the left panel, achieving the cost-effective contribution profile would require moving \(\mathop {\rm{CI}}=7\) units, as shown in the figure. The right panel shows the contribution profile if she were to contribute those 23 units in the most cost-ineffective manner. Achieving a cost-effective contribution profile from the right panel would require a movement of \(\mathop {\rm{CI}}=37\) units. Thus, the relative cost-ineffectiveness of the contribution profile chosen by the individual is \(\mathop {\rm{RCI}}=7/37\approx 0.1892\).
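The two \(\mathop {\rm{CI}}\) values in this example can be checked term by term from the definition, with \(w^i_{\ell }=10\) and \(m=4\). The most cost-ineffective allocation of 23 units, which follows from the definition of \(y^i(X^i)\), is (10, 10, 3, 0):

$$\begin{aligned} \mathop {\rm{CI}}(2,5,7,9)&= \min \{2 ; 5+3+1\} + \min \{2+5 ; 3+1\} + \min \{2+5+7 ; 1\} = 2+4+1 = 7, \\ \mathop {\rm{CI}}(10,10,3,0)&= \min \{10 ; 0+7+10\} + \min \{10+10 ; 7+10\} + \min \{10+10+3 ; 10\} = 10+17+10 = 37. \end{aligned}$$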

Fig. 2 Example for \(\mathop {\rm{RCI}}\)

As noted above, a key advantage of \(\mathop {\rm{RCI}}\) is that the measure is normalized, allowing for comparison across subjects with different total contributions. On the other hand, one challenge with \(\mathop {\rm{RCI}}\) is that it is undefined for individuals who cannot contribute cost-ineffectively, i.e., non-contributors and full contributors. However, practically speaking, this does not pose any problems for our analysis. We find that total contributions are quite similar across treatments (see distributions in Fig. 1), indicating that total contribution behavior is not a source of significant confounding variation in \(\mathop {\rm{RCI}}\).

Figure 3 plots the cumulative distributions of \(\mathop {\rm{RCI}}\) for the three treatments. Clearly, cost-ineffective giving is prevalent, indicating that subjects are not maximizing the impact and payoffs generated by their total contributions, which is at odds with Hypothesis 1. However, the prevalence of cost-ineffectiveness is similar across treatments. Given that the three curves in the figure are nearly identical, it is no surprise that neither a Kruskal–Wallis test (\(p=.8100\)) nor Wilcoxon tests show significant differences across treatments (C vs. R: \(p=.1965\); C vs. A: \(p=.2823\); R vs. A: \(p=.8867\)). Thus, the information environment appears to have little bearing on the cost-effectiveness of public good contributions.

Fig. 3 Relative cost-ineffectiveness

3.3 Individual characteristics

We next study how individual characteristics affect the cost-effectiveness of individual contributions, which is relevant for Hypotheses 2 and 3. Table 4 presents regression results relating \(\mathop {\rm{RCI}}\) to individual characteristics in each treatment (columns 1–3) and based on the average \(\mathop {\rm{RCI}}\) across treatments (column 4). In our discussion of these results, we evaluate differences at the 5% significance level, although the table also includes markers for significance at the 1% and 10% levels.

Table 4 Relative cost-ineffectiveness

With respect to Hypothesis 2, we do not find that ambiguity attitude has any effect on \(\mathop {\rm{RCI}}\) in any of the treatments. Likewise, we find little evidence that risk attitude affects \(\mathop {\rm{RCI}}\), except in the Ambiguity treatment.

Interestingly, related to Hypothesis 3, we find in all treatments that pure warm-glow givers, pure altruists, and impure altruists are more cost-ineffective compared with those who did not donate in the giving-type task (non-donors); however, this effect is only significant in the Ambiguity treatment for donors who are at least in part motivated by warm-glow (pure warm-glow givers and impure altruists). Note that these regression-based results rely upon comparisons against the omitted category (non-donors), which may not be the comparisons of primary interest. Kruskal–Wallis tests find differences in \(\mathop {\rm{RCI}}\) across giving types in both Risky (\(p=.0381\)) and Ambiguity (\(p=.0153\)), but not in Certain (\(p=.4056\)). For Risky and Ambiguity, Dunn’s tests indicate a significant difference between impure altruists and pure altruists, between impure altruists and non-donors, and between pure warm-glow givers and non-donors. Figure 4 shows the distribution of the average \(\mathop {\rm{RCI}}\) for each giving type. The average \(\mathop {\rm{RCI}}\) is 0.2525 for non-donors, 0.3369 for warm-glow givers, 0.2905 for pure altruists, and 0.3494 for impure altruists. A Kruskal–Wallis test identifies a significant difference in the average \(\mathop {\rm{RCI}}\) across giving types (\(p=.0425\)). Subsequent Dunn’s tests find the average \(\mathop {\rm{RCI}}\) to be significantly different between warm-glow givers and non-donors (\(p=.0219\)) and between impure altruists and non-donors (\(p=.0060\)); the difference between impure altruists and pure altruists is only significant at the 10% level (\(p=.0513\)) and no significant difference is found for the other three pairwise comparisons (\(p>.13\)). These results suggest that types who are at least partially motivated by warm glow (pure warm-glow givers and impure altruists) are contributing less cost-effectively than those who are not (non-donors and pure altruists).
This result can be further corroborated with a Mann–Whitney test comparing the average \(\mathop {\rm{RCI}}\) between these two groups of types (\(p=.0083\)). We note that these differences in \(\mathop {\rm{RCI}}\) across giving types are not caused by differences in total contributions. Kruskal–Wallis tests give \(p=.6131\) on total contributions averaged over the three treatments and \(p>.29\) for each treatment separately. Also, average total contributions are not different between the two groups of types (Mann–Whitney: \(p=.2057\)).

Fig. 4 Average relative cost-ineffectiveness and giving type

Regarding other individual characteristics, we note three findings. First, we find some gender differences, with females contributing in a less cost-effective manner compared to males in Risky and Ambiguity. Second, participants with a lower attention level, as measured via the cognitive reflection test, are found to be less cost-effective in all three treatments. Third, risk literacy shows no significant effect.

The between-subjects analysis largely confirms the findings of the within-subjects analysis. For Hypothesis 1, we observe a slightly larger dispersion for the \(\mathop {\rm{RCI}}\) between treatments, yet this difference is not significant, in line with the within-subjects analysis. For Hypothesis 2, we do not find significant coefficients for risk or ambiguity attitudes, except in one case: the regression coefficient for ambiguity attitude is positive and significant at the 5% level in Risky. However, if we were to use a more stringent threshold that corrects for multiple hypothesis testing, this effect would be insignificant. With respect to giving types and differences in cost-effectiveness (Hypothesis 3), we again see a similar pattern to that of the within-subjects analysis, but the p-value from a Kruskal–Wallis test is larger and now stands at .0773. Further detail on the between-subjects analysis is available in Appendix B.

4 Discussion and conclusion

We set out to investigate how cost-effectiveness varies across information environments and by individual characteristics. We find no evidence that the information environment affects overall contributions or the cost-effectiveness of contributions. In terms of individual characteristics, we find that risk and ambiguity attitudes have little bearing on the cost-effectiveness of contributions. Meanwhile, giving type does influence cost-effectiveness, with individuals who are at least partially motivated by warm glow (i.e., pure warm-glow givers and impure altruists) contributing less cost-effectively than those who are not motivated by warm glow (i.e., pure altruists and non-donors).

This latter finding can be rationalized with theory and is consistent with common wisdom concerning warm-glow motives. Indeed, previous literature assumes warm-glow givers are more likely to donate inefficiently than are pure altruists [see, e.g., Null (2011) and Singer (2015)], yet we are not aware of any prior work that directly tests this relationship. Against this backdrop, our results are novel and provide important empirical backing for this commonly held claim. In addition to documenting differences in cost-effectiveness across giving types, we furthermore show that these differences are most pronounced in settings where there may be risk or ambiguity surrounding the value of public good contributions.

More broadly, our experimental work elucidates key factors underlying—or undermining—effective altruism. We shed new light on how the cost-effectiveness of public good provision efforts is influenced by individual characteristics in general and giving types in particular. Our inquiry into environments with multiple public goods is especially germane given expanding options for public good provision through traditional charities and emerging crowdfunding and patronage platforms.

One point about our giving type results merits further discussion. Although we find pure altruists have a lower \(\mathop {\rm{RCI}}\) than warm-glow givers and impure altruists, it may seem surprising that pure altruists display any \(\mathop {\rm{RCI}}\) at all. Theoretically, a pure altruist should consider their contributions and others’ contributions as perfect substitutes and care about efficiency as they derive utility from the total amount of the public good (or charitable good) provided. Methods for eliciting giving type focus on only one of these two features of pure altruism. Those relying only on perfect substitutability include Crumpler and Grossman (2008), Gangadharan et al. (2018), Ottoni-Wilhelm et al. (2017), and Fielding et al. (2022). Gangadharan et al. (2023) propose a different elicitation method that classifies subjects as pure altruists if their decisions in an experiment show they care about the effectiveness of their donation. We find that some subjects who are classified as pure altruists based on the Gangadharan et al. (2018) elicitation do contribute in a cost-ineffective manner; thus, a subject could be classified as a pure altruist by one of Gangadharan et al.’s classification systems, but not the other.

We can imagine several avenues for future research. First, our theoretical exposition suggests that those with warm-glow motives are more likely to contribute in a cost-ineffective manner. Indeed, we find evidence of this in our experiment, but it is possible that there are other factors that correlate with giving type that drive these between-treatment differences. For example, different giving types may also have different norms or beliefs about how one should allocate resources across different charitable causes. Exploring these possibilities will provide a better understanding of whether giving type, per se, is responsible for our experimental results. A second interesting line of inquiry is to vary whom the individual interacts with in the different public goods. In our experiment, a subject interacts with the same set of group members across all four public goods, but in real-world settings, it is more likely that they will interact with different individuals or networks in each (Bramoullé and Kranton, 2007; Richefort, 2018). Third, there is a need for further inquiry into how giving types are characterized and whether these differ across contexts. In our experiment, we have elicited giving types using a charitable giving task, and we have found that these giving types are correlated with different patterns of behavior in a public good game. However, it is not obvious that giving types will necessarily be consistent across these settings (Chan et al., 2023).