1 Introduction

Experiments on decision making under risk mostly employ a coalesced presentation of lotteries, i.e., branches which lead to the same consequences are combined and the respective probabilities are added up. However, presenting gamble pairs in a canonical split form makes them easier to compare and process for the decision maker since, in the case of binary choice, both gambles involve the same set of probabilities. For illustration, consider the classic paradox of Allais (1953), also termed common consequence effect, where M$ denotes millions of dollars. Figure 1 presents the Allais paradox in the commonly used coalesced form. Here, subjects tend to choose option A in Choice 1 and option B’ in Choice 2, which constitutes a violation of expected utility.

Fig. 1 Common consequence effect in a coalesced form

Birnbaum (2004) showed that violations of expected utility (EU) in the common consequence effect can be substantially reduced if the gamble pairs are presented in their canonical split form, as depicted in Fig. 2. In the canonical split form, which is a commonly used way of splitting, both lotteries are split such that corresponding ranked branches carry equal probabilities and the number of branches is equal in both gambles and minimal. The presentation in Fig. 2 makes it more transparent that both gambles in each choice have an 89% chance of a common outcome, which should be ignored when determining the preferred option under EU.

Fig. 2 Reduced common consequence effect in a split form

Other typical violations of EU, like the common ratio effect or violations of transitivity, are also observed less frequently for split than for coalesced presentation of gambles (Humphrey 2001; Schmidt and Seidl 2014; Birnbaum et al. 2017). The fact that coalesced and split presentation of gambles can lead to systematically different choice behavior has already been discussed by Starmer and Sugden (1993) and Humphrey (1995) under the term event-splitting effects. Nevertheless, the implications of these effects have remained largely unexplored in the economics and management literature. The present paper focuses on two questions in this context.

Question 1 is devoted to the shape of the probability weighting function. Many descriptive alternatives to the EU, like prospect theory (Kahneman and Tversky 1979; Tversky and Kahneman 1992) or rank-dependent utility (Quiggin 1982), integrate a non-linear distortion of probabilities formalized by a probability weighting function in their utility representation. Nowadays, the majority consensus in the literature is that this function is typically inverse-S shaped, i.e., small probabilities are overweighted, whereas large ones are underweighted (Wu and Gonzalez 1996; Gonzalez and Wu 1999; Abdellaoui 2000; Bleichrodt et al. 2001; although not unanimously, see, e.g., Hertwig 2012 for critique). The evidence has mainly been derived using coalesced presentation of the lotteries. We analyze how the shape of the probability weighting function differs if the gamble pairs are presented in a split form instead. Since probabilities are easier to compare in split form, an absence or diminished extent of the typical non-linear shape in this form could indicate that the previous evidence mostly reflects difficulties in processing probabilities instead of an ingrained non-linear weighting of probabilities.

Question 2 is related but more general. As violations of EU decrease under the split form, this form could be more suitable for eliciting von Neumann–Morgenstern utility functions in EU. The split form could thus improve prescriptive decision analysis, as the assessment of von Neumann–Morgenstern functions is central in this context (von Winterfeldt and Edwards 1986; McCord and de Neufville 1986; Fischhoff 1991; Bleichrodt et al. 2001).

Our analyses of both questions rely on parametric estimation of EU and rank-dependent utility (RDU), with the latter corresponding to the gain-domain part of cumulative prospect theory (CPT). Firstly, to answer Question 2, we ask whether the fit of EU improves if we use choice data from split lotteries. We then extend this analysis from EU to RDU. To keep the analysis manageable, we restrict our attention to pure gain gambles.

To date, only limited work has examined what intelligible impact, if any, failing to account for splitting effects has in the RDU framework. Indeed, little is known about the impact of splitting effects on the estimated values of the central parameters of RDU or on the model fit. Real-life gambles do not always occur in a split form, but explicitly presenting them in a split form could improve the fit of RDU. Moreover, it could alter the features of the probability weighting function, namely, the magnitude of the usually observed non-linear shape, which brings us back to Question 1. This is indeed what we find: the split form diminishes probability weighting and, by improving the fit of EU, can support prescriptive decision analysis.

The paper is organized as follows. We discuss the theoretical background and related literature in Sect. 2. We lay out the experimental design and estimation approaches in Sect. 3 and present the results in Sect. 4. Finally, we discuss the limitations and implications for future research in Sects. 5 and 6.

2 Background

2.1 Expected utility, rank-dependent utility and cumulative prospect theory

We consider a set of real-valued outcomes X. \({\mathcal{P}}\) denotes the set of all gambles or lotteries over X. A gamble \(P \in {\mathcal{P}}\) satisfies the axioms of Kolmogoroff (1933), i.e., \(0 \le p\left( {x_{i} } \right) \le 1 \,\forall\, x_{i} \in X\) and \(p(X) = 1\). Preferences of the decision maker are formalized by a binary relation ≽ \(\subseteq {\mathcal{P}} \times {\mathcal{P}}\). A function \(V:{\mathcal{P}} \to {\mathbb{R}}\) represents ≽ on \({\mathcal{P}}\) if and only if \(P \succcurlyeq Q \Leftrightarrow V\left( P \right) \ge V\left( Q \right)\) for all \(P, Q \in {\mathcal{P}}\). For a gamble with n possible outcomes, the preferences in EU can be represented by

$$V(P) = \mathop \sum \limits_{i = 1}^{n} u(x_{i} )p(x_{i} ),$$
(1)

where u is the von Neumann–Morgenstern utility function. In parametric analysis, u is commonly assumed to be a power function \(u\left( {x_{i} } \right) = x_{i}^{\alpha }\), where \(1 - \alpha\) is the coefficient of relative risk aversion.
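As a minimal sketch of (1) with the power utility, the following Python snippet evaluates a gamble given as a list of (outcome, probability) branches. The function names, the example gamble and the value α = 0.5 are illustrative assumptions and do not come from the experiment.

```python
def power_utility(x, alpha):
    """Power utility u(x) = x**alpha for a non-negative outcome x."""
    return x ** alpha


def expected_utility(gamble, alpha):
    """Eq. (1): sum of u(x_i) * p(x_i) over the branches of a gamble.

    `gamble` is a list of (outcome, probability) pairs.
    """
    return sum(p * power_utility(x, alpha) for x, p in gamble)


# Hypothetical example: a 50/50 gamble over 100 and 0 with alpha = 0.5.
ALPHA = 0.5
gamble = [(100.0, 0.5), (0.0, 0.5)]
eu = expected_utility(gamble, ALPHA)

# Certainty equivalent: the sure amount whose utility equals the EU.
certainty_equivalent = eu ** (1.0 / ALPHA)
expected_value = sum(p * x for x, p in gamble)
print(eu, certainty_equivalent, expected_value)
```

With α < 1 the certainty equivalent falls below the expected value of the gamble, which is the sense in which 1 − α measures relative risk aversion.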

In the CPT framework, outcomes of a gamble are ordered in an increasing order \(x_{1} \le \cdots \le x_{k} \le 0 \le x_{k + 1} \le \cdots \le x_{n}\) and preferences can be represented by a sum of two RDU functionals

$$V\left( P \right) = \mathop \sum \limits_{i = 1}^{k} \pi_{i}^{ - } v\left( {x_{i} } \right) + \mathop \sum \limits_{j = k + 1}^{n} \pi_{j}^{ + } v\left( {x_{j} } \right),$$
(2)

where sign dependence, reference dependence and rank dependence are all satisfied (Tversky and Kahneman 1992). In parametric analysis, the value function is mostly taken as a two-part power function:

$$v\left( x \right) = \left\{ {\begin{array}{*{20}l} {x^{\alpha } } \hfill & {x \ge 0} \hfill \\ { - \lambda \left( { - x} \right)^{\beta } } \hfill & {x < 0} \hfill \\ \end{array} } \right..$$
(3)

It assigns a number v(x) to each outcome x to describe the subjective value of the outcome relative to a reference point. The reference-dependent S-shaped value function with v(0) = 0 firstly exhibits diminishing sensitivity to gains and losses, such that the function is concave (with \(0 < \alpha < 1\) exhibiting risk aversion for gains) or convex (with \(0 < \beta < 1\) exhibiting risk seeking behavior for losses), respectively. Secondly, the value function implies loss aversion (when \(\lambda > 1\)) in that a loss of a given amount has more impact on the attractiveness of a prospect than a gain of an equivalent amount: \(- v\left( { - x} \right) > v\left( x \right)\) for all x > 0 (Kahneman and Tversky 1979; Tversky and Kahneman 1991). Note, however, that we focus exclusively on the gain domain in our analyses.

The decision weights \(\pi^{ + }\) for the cumulative probabilities of positive outcomes in (2) are defined by

$$\pi_{n}^{ + } = w^{ + } \left( {p_{n} } \right),$$
(4)

with

$$\pi_{j}^{ + } = w^{ + } \left( {p_{j} + \cdots + p_{n} } \right) - w^{ + } \left( {p_{j + 1} + \cdots + p_{n} } \right),\, k < j < n.$$
(5)

The probability weighting function \(w^{ + }\) is strictly increasing and continuous. It is defined for the whole probability domain \(\left[ {0, 1} \right]\) and satisfies \(w^{ + } \left( 0 \right) = 0\) and \(w^{ + } \left( 1 \right) = 1\).
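The rank-dependent construction in (4) and (5) can be sketched in a few lines of Python. The function below takes any weighting function w as an argument; the stand-in w(p) = p^0.7 and the probabilities are assumptions chosen only to make the example concrete.

```python
def decision_weights(probs, w):
    """Decision weights of Eqs. (4)-(5) for a pure-gain gamble.

    `probs` lists branch probabilities with outcomes sorted from lowest
    to highest, so the last entry belongs to the best outcome.
    `w` is a weighting function with w(0) = 0 and w(1) = 1.
    """
    weights = []
    tail = 0.0  # probability of obtaining a strictly better outcome
    for p in reversed(probs):
        # Weight = w(prob. of this outcome or better) - w(prob. of better)
        weights.append(w(tail + p) - w(tail))
        tail += p
    weights.reverse()
    return weights


def stand_in_w(p):
    """Hypothetical concave weighting function, for illustration only."""
    return p ** 0.7


probs = [0.8, 0.1, 0.1]
weights = decision_weights(probs, stand_in_w)
linear_weights = decision_weights(probs, lambda p: p)
print(weights, linear_weights)
```

Because the differences telescope, the weights sum to w(1) − w(0) = 1, and a linear w returns the original probabilities, which is the EU case.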

In their work, Tversky and Kahneman (1992) propose fitting the data to the following single-parameter functional form of the probability weighting function:

$$w^{ + } \left( p \right) = \frac{{p^{{\gamma^{ + } }} }}{{\left( {p^{{\gamma^{ + } }} + \left( {1 - p} \right)^{{\gamma^{ + } }} } \right)^{{1/\gamma^{ + } }} }}.$$
(6)

The typically observed inverse-S shape of this weighting function (henceforth TKW) exhibits overweighting of small probabilities (up to the crossover point where w(p) = p) and underweighting of large probabilities.

Furthermore, to contribute to the empirical tractability of the model, we consider additional parametric specifications of the probability weighting function. For example, the two-parameter linear-in-log-odds specification introduced by Goldstein and Einhorn (1987) has been claimed to be the most commonly used specification of the probability weighting function (Booij et al. 2010). This function (henceforth GEW) is given by

$$w\left( p \right) = \frac{{\delta p^{\gamma } }}{{\delta p^{\gamma } + \left( {1 - p} \right)^{\gamma } }},$$
(7)

where two parameters, instead of one, independently explain the shape of the weighting function. Namely, the γ parameter (which is usually assumed to satisfy \(0 < \gamma < 1\) to maintain the inverse-S shape) controls the curvature of the function and thus serves as a discriminability indicator, while the δ parameter (δ > 0) explains the elevation of the function and thus serves as an attractiveness indicator (Tversky and Kahneman 1992; Gonzalez and Wu 1999). Given that the two properties often do not covary, this specification can offer a substantial advancement in relation to the Tversky and Kahneman (1992) specification outlined above (Gonzalez and Wu 1999; Booij et al. 2010). We should note, however, that the exact values of the named weighting parameter estimates are subject to possible interaction effects, e.g., between δ and α, which both take into account some part of risk aversion (Nilsson et al. 2011; Glöckner and Pachur 2012).

Another two-parameter specification enabling similar interpretation of the γ and δ parameters as GEW was introduced by Prelec (1998) (henceforth P2W):

$$w\left( p \right) = {\text{e}}^{{ - \delta ( - \ln \left( p \right))^{\gamma } }} .$$
(8)

A special case of the P2W with \(\delta = 1\) is expressed in the single-parameter form (henceforth P1W):

$$w\left( p \right) = {\text{e}}^{{ - ( - \ln \left( p \right))^{\gamma } }} .$$
(9)

P1W can reportedly outperform the other weighting functions presented above if used in combination with the power value function in (3) (Stott 2006).

Previous studies have delivered mixed results regarding which of the four functional forms of the probability weighting function provides the best fit (see, e.g., Wu and Gonzalez 1996; Gonzalez and Wu 1999; Sneddon and Luce 2001). In this paper, we thus consider the power value function in combination with all four functional forms of the weighting function presented above: the one-parameter TKW and P1W and the two-parameter GEW and P2W. In addition, we introduce a fifth weighting function attributable to EU: w(p) = p (restricting the γ parameter to unity such that no weighting takes place, henceforth denoted EUW) for benchmark purposes as a special linear case of the one-parameter weighting function.
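For reference, the five weighting functions in (6) to (9) and the EUW benchmark can be written compactly as follows. This is a sketch with assumed function names; the value γ = 0.6 used in the check is purely illustrative and is not an estimate from this paper.

```python
import math


def tkw(p, gamma):
    """Eq. (6): Tversky and Kahneman (1992) weighting function."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def gew(p, gamma, delta):
    """Eq. (7): Goldstein and Einhorn (1987) linear-in-log-odds form."""
    num = delta * p ** gamma
    return num / (num + (1.0 - p) ** gamma)


def p2w(p, gamma, delta):
    """Eq. (8): two-parameter Prelec (1998) form."""
    if p <= 0.0:  # limit of the formula as p -> 0
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-delta * (-math.log(p)) ** gamma)


def p1w(p, gamma):
    """Eq. (9): one-parameter Prelec form, i.e. P2W with delta = 1."""
    return p2w(p, gamma, 1.0)


def euw(p):
    """Linear benchmark: no probability weighting."""
    return p


# Illustrative inverse-S check with an assumed gamma of 0.6.
GAMMA = 0.6
small, large = tkw(0.05, GAMMA), tkw(0.9, GAMMA)
print(small > 0.05, large < 0.9)
```

Setting γ = 1 (and δ = 1 where applicable) makes every specification collapse to w(p) = p, which is why EUW can serve as a nested benchmark.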

2.2 Event-splitting effects

By definition, an event-splitting effect (also called violation of coalescing) occurs when a reversal of preference arises in response to a coalesced- versus split-form change of the same choice. By using the term split form of a gamble throughout the paper, we refer to the canonical split form, for which both gambles of a choice are split in a way that allows the corresponding ranked branches to have equal probabilities while keeping the number of branches minimal (also implying the same number of branches in both gambles).Footnote 4 For example, a choice between A = (€40, 0.1; €40, 0.1; €2, 0.8) and A′ = (€98, 0.1; €2, 0.1; €2, 0.8) is called the canonical split form of the choice between the coalesced B = (€40, 0.2; €2, 0.8) and B′ = (€98, 0.1; €2, 0.9) (Birnbaum and Navarrete 1998).

Referring to a split gamble and its coalesced counterpart, Kahneman (2003) suggests that “most decision makers will spontaneously transform the former prospect into the latter and treat them as equivalent in subsequent operations of evaluation and choice” (p. 727). This observation largely coincides with the definition of coalescing, which refers to the assumption that any two or more branches leading to the same outcome can be combined by adding their probabilities without affecting the utility of the gamble, such that \(A = \left( {x, p;x, q;y, 1 - p - q} \right) \sim B = \left( {x, p + q;y, 1 - p - q} \right)\) and additionally \(C = \left( {x, p;y, q;y, 1 - p - q} \right) \sim D = \left( {x,p;y, 1 - p} \right)\). Together with transitivity, coalescing implies that \(A \succ C\) if and only if \(B \succ D\). From above, \(A \sim B\) and \(C \sim D\). Therefore, \(A \succ C\) gives \(B \sim A \succ C \sim D\) and thus, by transitivity, \(B \succ D\); the converse follows in the same way (Birnbaum et al. 2017). Because coalescing and transitivity should be satisfied within the CPT framework with any w(p) function, there should also be no splitting effects (Birnbaum and Navarrete 1998; Luce 1998; Birnbaum 2008; see Appendix 1 for a proof illustrating how the CPT’s assumption of rank dependency implies the satisfaction of coalescing).
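The claim that RDU satisfies coalescing can be checked numerically. The sketch below evaluates the split and coalesced gambles from the example in this section under RDU with the power value function and TKW. The helper names are assumptions, and the parameter values α = 0.88 and γ = 0.61 are the Tversky and Kahneman (1992) estimates, used here only for illustration.

```python
import math


def tkw(p, gamma):
    """Tversky and Kahneman (1992) weighting function, Eq. (6)."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def rdu_value(gamble, alpha, gamma):
    """RDU value of a pure-gain gamble given as (outcome, probability) pairs.

    Branches with identical outcomes are allowed (split form).
    """
    # Process branches from the best outcome down to the worst.
    branches = sorted(gamble, key=lambda b: b[0], reverse=True)
    value, tail = 0.0, 0.0
    for x, p in branches:
        upper = min(tail + p, 1.0)  # guard against float drift above 1
        value += (tkw(upper, gamma) - tkw(tail, gamma)) * x ** alpha
        tail = upper
    return value


ALPHA, GAMMA = 0.88, 0.61  # illustrative values, not estimates from this study

A_split = [(40, 0.1), (40, 0.1), (2, 0.8)]
B_coalesced = [(40, 0.2), (2, 0.8)]
A_prime_split = [(98, 0.1), (2, 0.1), (2, 0.8)]
B_prime_coalesced = [(98, 0.1), (2, 0.9)]

v_a = rdu_value(A_split, ALPHA, GAMMA)
v_b = rdu_value(B_coalesced, ALPHA, GAMMA)
v_a_prime = rdu_value(A_prime_split, ALPHA, GAMMA)
v_b_prime = rdu_value(B_prime_coalesced, ALPHA, GAMMA)
print(math.isclose(v_a, v_b), math.isclose(v_a_prime, v_b_prime))
```

The weights of adjacent branches with the same outcome telescope into the weight of the coalesced branch, so the two forms receive the same value for any admissible w. An observed preference reversal between the forms therefore cannot be reproduced by a single set of RDU parameters.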

However, there exists abundant evidence showing that people do not treat coalesced-form and split-form gambles as equal (e.g., Conlisk 1989; Starmer and Sugden 1993; Humphrey 1995). In fact, it has been shown that splitting the branch with the highest available outcome can increase the attractiveness of a gamble in comparison to a coalesced form of the same gamble. Conversely, splitting the branch with the lowest available outcome decreases the attractiveness of the gamble (Starmer and Sugden 1993; Humphrey 1995, 2001). Splitting both the highest and lowest branches in a binary gamble with two equiprobable positive branches tends to make the gamble worse, in compliance with loss aversion (Birnbaum 2008).

Interestingly, it appears that violations of coalescing cannot be attributed to lack of knowledge, as they are persistent even in people with doctoral degrees who are familiar with the literature on decision making (Birnbaum 1999). Neither can the splitting effects be explained by errors, as they are still persistent when errors are factored out (Birnbaum et al. 2017), nor can the effects be attributed to the particular format used for presenting, or framing, the gambles (Birnbaum 2004, 2006; Birnbaum et al. 2008). Decision heuristics, like anchoring and adjustment, cannot account for the observed splitting effects either (Humphrey 1996). Meanwhile, the results regarding the effects of specific learning and experience are still mixed (see, e.g., Humphrey 2006; Birnbaum and Schmidt 2015).

Indeed, it seems that people simply “do not obey coalescing” (Birnbaum 2007, p. 171). The assumption that coalesced and split forms of the same gamble would be treated equivalently is thus “empirically false” (Birnbaum 2008, p. 464). And yet, CPT is still argued by many to be “the “best”, if imperfect, description of decision making under risk and uncertainty” (Birnbaum 2008, p. 463). In light of this controversy, a question arises regarding what intelligible impact, if any, the splitting effects have on the conclusions drawn by the “imperfect” CPT framework.

We examine which of the two gamble presentation forms (coalesced versus split) leads to more normatively accurate, or rational, results. In this context, the term rationality is used in the sense proposed by von Neumann and Morgenstern (1947). It implies being in line with the normative preference axioms of the EU and, most notably, the substitution axiom.

We hypothesize that the normative EU explains data comprised of split-form gambles better than data comprised of coalesced-form gambles, while it is unclear whether this is the case for the descriptive RDU model. A few previous studies have already focused on the CPT’s fit to data that indirectly test coalescing (e.g., Birnbaum and Chavez 1997; Birnbaum and Navarrete 1998), but little is known about how well selected data comprised of coalesced- versus split-form gamble pairs fit an RDU model with varying functional specifications.

3 Methodology

3.1 Experimental procedure

We examine these questions using data from a pairwise choice experiment (see Birnbaum et al. 2017), conducted with 54 student subjects at the University of Kiel in Germany (all undergraduate students, 61% in the economics and business administration programs; 22.0 years old on average; of them, 21 female). The experiment is based on a random-lottery incentive mechanism, which is a commonly used one-step choice-based elicitation approach that lets subjects face multiple pairs of gambles in a sequence and choose a preferred gamble for each of the pairs (see, e.g., Hey and Orme 1994; Wu and Gonzalez 1996).

At the end of the experiment, one pair is chosen randomly and played out for real. Each pair consists of a risky gamble \(R = \left( {x_{1} ,p_{1} ;x_{2} ,p_{2} ;x_{3} ,p_{3} ;x_{4} ,p_{4} } \right)\) and a safe gamble \(S = \left( {y_{1} ,q_{1} ;y_{2} ,q_{2} ;y_{3} ,q_{3} ;y_{4} ,q_{4} } \right)\) with two to four outcomes \(x_{i}\), \(y_{i}\) and respective probabilities \(p_{i}\), \(q_{i}\) that are systematically varied. The choices between the gambles are presented in a pseudo-random order and the outcomes \(x_{i}\), \(y_{i}\) are ordered from the lowest to the highest within each gamble (see Appendix 2 for an example of the presentation format and the experiment instructions).

The dataset comprises 28 gamble pairs (14 of them presented in a split form, see Appendix 3 for an overview of all gambles and the respective descriptive statistics of gamble choices), implying 4 × 28 = 112 choice situations faced by each subject over four repetitions, that is, 54 × 4 × 28 = 6048 choice situations observed in total, 3024 of them in a split form. Note, however, that the gambles in our study are relatively specific in that no certain outcomes are included and the values and probabilities of high outcomes are relatively similar between the gambles within a decision. The subjects received, on average, a €19.1 cash reward (including a €5.0 show-up reward) for an approximately 90-minute session, leading to an average reward of €12.8 per hour.

3.2 Structural modeling

We apply structural methods to jointly estimate several core parameters of the EU and RDU frameworks. In particular, we use maximum likelihood estimation (MLE) to determine the most likely parameter values to have generated the given dataset within the specification bounds. In addition, the log-likelihood of the MLE allows us to measure the goodness of fit of the respective frameworks.
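To make the estimation idea concrete, the following sketch fits the two parameters of an EU model (α and a sensitivity parameter φ) by maximum likelihood, using a logistic choice rule of the kind specified later in this section. It is a toy illustration under loud assumptions: the five observations in `DATA` are hypothetical, a coarse grid search stands in for a numerical optimizer, and none of this reproduces the estimation code behind the reported results.

```python
import math
from itertools import product


def expected_utility(gamble, alpha):
    """EU of a gamble given as (outcome, probability) pairs, u(x) = x**alpha."""
    return sum(p * x ** alpha for x, p in gamble)


def neg_log_likelihood(alpha, phi, data):
    """Negative log-likelihood of observed choices under a logistic rule."""
    nll = 0.0
    for risky, safe, chose_risky in data:
        d = phi * (expected_utility(risky, alpha) - expected_utility(safe, alpha))
        if not chose_risky:
            d = -d  # probability of choosing S is the logistic of -d
        # -log(1 / (1 + exp(-d))), arranged to avoid overflow for large |d|
        if d >= 0:
            nll += math.log1p(math.exp(-d))
        else:
            nll += -d + math.log1p(math.exp(d))
    return nll


# Hypothetical observations: (risky gamble, safe gamble, risky chosen?)
DATA = [
    ([(98, 0.1), (2, 0.9)], [(40, 0.2), (2, 0.8)], False),
    ([(98, 0.1), (2, 0.9)], [(40, 0.2), (2, 0.8)], True),
    ([(60, 0.5), (5, 0.5)], [(30, 0.6), (20, 0.4)], False),
    ([(80, 0.3), (10, 0.7)], [(35, 0.5), (25, 0.5)], False),
    ([(50, 0.4), (12, 0.6)], [(25, 0.5), (20, 0.5)], True),
]

alphas = [0.1 * k for k in range(1, 16)]   # 0.1, 0.2, ..., 1.5
phis = [0.05 * k for k in range(1, 41)]    # 0.05, 0.10, ..., 2.0

best_alpha, best_phi = min(
    product(alphas, phis),
    key=lambda ap: neg_log_likelihood(ap[0], ap[1], DATA),
)
best_nll = neg_log_likelihood(best_alpha, best_phi, DATA)
print(best_alpha, best_phi, best_nll)
```

In an actual application, the grid would be replaced by a gradient-based or simplex optimizer, and the weighting-function parameters would enter through the RDU value in place of `expected_utility`.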

RDU includes subjective values of outcomes and subjective weights of probabilities. For the subjective values, we use a power value function in all models. We thus add the α parameter as the first one in the list of estimable parameters. For the probability weighting, we examine four weighting function specifications given that, firstly, the probability weighting function appears to be central for considering the coalesced- versus split-form data and, secondly, the explanatory power of the RDU model depends on the function specifications and the corresponding interaction effects (e.g., Stott 2006). We thus add the γ (and δ, where relevant) parameter to the list of estimable parameters (see Appendix 7 for a full list of parameters in each model).

In addition, we extend the RDU to accommodate stochastic behavior by applying an exponential specification of the choice rule of Luce (1959) (see also Rieskamp 2008). The exponential specification of Luce’s choice rule is defined as

$$p\left( {R,S} \right) = \frac{{{\text{e}}^{\phi V\left( R \right)} }}{{{\text{e}}^{\phi V\left( R \right)} + {\text{e}}^{\phi V\left( S \right)} }},$$
(10)

where \(p\left( {R,S} \right)\) stands for the probability of choosing the risky gamble \(R\) over the safe gamble \(S\) and the sensitivity parameter \(\phi > 0\) specifies how sensitively the model reacts to differences between the subjective values \(V\left( R \right)\) and \(V\left( S \right)\) of the gambles \(R\) and \(S\), respectively (Rieskamp 2008; Nilsson et al. 2011). Consequently, given that it is feasible to structurally estimate all specified parameters jointly with MLE, we add the \(\phi\) parameter to the list of estimable parameters.
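The choice rule in (10) is a logistic function of the value difference, which the short sketch below makes explicit. The function name and the numerical values are assumptions for illustration.

```python
import math


def choice_probability(v_risky, v_safe, phi):
    """Eq. (10): probability of choosing R over S.

    Dividing numerator and denominator of (10) by exp(phi * V(R)) gives
    1 / (1 + exp(-phi * (V(R) - V(S)))), which is evaluated here in a
    form that avoids overflow for large value differences.
    """
    d = phi * (v_risky - v_safe)
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


# Hypothetical subjective values with V(R) slightly above V(S).
p_low_sensitivity = choice_probability(10.5, 10.0, phi=0.1)
p_high_sensitivity = choice_probability(10.5, 10.0, phi=5.0)
p_indifferent = choice_probability(10.0, 10.0, phi=5.0)
print(p_low_sensitivity, p_high_sensitivity, p_indifferent)
```

As φ approaches zero, choices become random (probability 0.5 regardless of the values); as φ grows, the rule approaches deterministic choice of the gamble with the higher value.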

Using the choice rule, we attempt to quantify the goodness of fit of the RDU model predictions given the actual choices between the gambles. A useful tool for the purpose is the deviance measure \(G^{2}\), expressed as

$$G^{2} = - 2\mathop \sum \limits_{i = 1}^{N} \ln \left[ {f_{i} \left( {y|\theta } \right)} \right],$$
(11)

with \(i\) indexing the observed choices between gambles and \(N\) denoting the total number of choices. \(f_{i} \left( {y|\theta } \right)\) expresses the probability that the RDU model with its parameter values \(\theta\) predicts a choice \(y\), such that \(f_{i} \left( {y|\theta } \right) = p_{i} \left( {R,S} \right)\) if the gamble \(R\) is chosen and \(f_{i} \left( {y|\theta } \right) = 1 - p_{i} \left( {R,S} \right)\) if the gamble \(S\) is chosen. Low values of \(G^{2}\) are indicators of good choice predictions and, hence, a good fit of the RDU model (Rieskamp 2008). A directly related measure of fit is the Akaike information criterion (AIC) that additionally adjusts for the complexity of the model (namely, the number of parameters in a given specification) and thus allows comparing the explanatory power of differing models. The AIC is defined as

$${\text{AIC}} = G^{2} + 2n,$$
(12)

where n stands for the number of free parameters in a model (Akaike 1973). As a rule, an AIC difference of \(\Delta {\text{AIC}} > 10\) between two given models strongly favors the model with the lower AIC measure over the other (Burnham and Anderson 2002).
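The two fit measures in (11) and (12) translate directly into code. The following sketch uses assumed function names and a hypothetical set of four predicted choice probabilities; it is meant to illustrate the definitions, not to reproduce any reported value.

```python
import math


def g_squared(choice_probs, chose_risky):
    """Eq. (11): G^2 = -2 * sum of log-probabilities of the observed choices.

    `choice_probs[i]` is the model's p_i(R, S); `chose_risky[i]` is True
    if the risky gamble was chosen in choice i.
    """
    total = 0.0
    for p_risky, risky in zip(choice_probs, chose_risky):
        likelihood = p_risky if risky else 1.0 - p_risky
        total += math.log(likelihood)
    return -2.0 * total


def aic(g2, n_params):
    """Eq. (12): AIC = G^2 + 2n, with n free parameters."""
    return g2 + 2 * n_params


# Hypothetical predictions and observed choices.
observed = [True, False, True, True]
chance_model = [0.5, 0.5, 0.5, 0.5]     # predicts at random, no parameters
fitted_model = [0.8, 0.3, 0.7, 0.9]     # assumed output of a 2-parameter model

aic_chance = aic(g_squared(chance_model, observed), n_params=0)
aic_fitted = aic(g_squared(fitted_model, observed), n_params=2)
print(aic_chance, aic_fitted, aic_chance - aic_fitted)
```

A model that predicts at chance level yields \(G^{2} = 2N\ln 2\), which provides a natural upper reference point when reading the AIC values in the tables.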

4 Results

4.1 Fitting coalesced- and split-form data to RDU and EU

For the main hypothesis, we analyze the AIC measures for the EU specification using the coalesced-form and split-form data separately. This approach allows us to consider whether, in some settings, one could find grounds for preferring the one form of presenting the gambles over the other. The linear case of the RDU one-parameter weighting function (with restricting the γ parameter to unity, thus attaining EUW) shows clear evidence that the split-form gamble pairs provide a better fit than coalesced-form gambles in the EU specification (coalesced form \({\text{AIC}}^{\text{EUW}}\) = 3870, split form \({\text{AIC}}^{\text{EUW}}\) = 3816, see \(\Delta {\text{AIC}}\) in Table 1). However, the results also indicate that for the RDU specifications, the coalesced form data provide a better fit than the split-form data. Namely, three out of the four core RDU specifications show a \(\Delta {\text{AIC}}\) that notably exceeds 10, thus providing a result in favor of the coalesced-form data, while the fourth (TKW) shows a \(\Delta {\text{AIC}}\) smaller than 10, providing an insufficiently conclusive result.

Table 1 Comparison of the model fit with 28 gamble pairs considered

Interestingly, the opposite holds when examining directly comparable one-type gambles with split highest and lowest branches exclusively (see \(\Delta {\text{AIC}}\) in Table 2 and the list of gambles in Appendix 3): The model that uses split-form data outperforms the same model that uses equivalent coalesced-form data in all four RDU specifications. The question of whether experiments ought to include split-form gamble pairs rather than their coalesced equivalents to ensure a more accurate preference elicitation thus finds some confirmation here but is still open for further examination.

Table 2 Comparison of the model fit with 16 one-type gamble pairs considered

The four RDU specifications outperform the EU specification in terms of the model fit (see \(\Delta {\text{AIC}}^{\text{EUW}}\) in Table 2) for the coalesced-form data. For the split-form data of the directly comparable one-type gambles, the \({\text{AIC}}^{\text{EUW}}\) and the respective measures of fit for the four RDU specifications no longer differ significantly.

Finally, the one-parameter P1W outperforms TKW and the two-parameter P2W outperforms GEW in terms of the fit measures in practically all of the examined models. Therefore, in the following discussion of results, we focus on these best-performing models in particular; see Appendices 4–6 for the full results of all examined RDU specifications and model versions with various control variables. In total, we use four versions of the model (referred to as models M1 to M4, see Appendix 7 for a summary).

4.2 Comparing RDU parameter estimates for the coalesced- and split-form data

The results for the weighting function parameters \(\gamma\) and \(\delta\) of model M1 in Table 3 are unequivocal. These parameters differ significantly, at the 1% significance level, between coalesced- and split-form gamble pairs. The direction of the differences is also consistent over the RDU specifications, with \(\gamma_{\text{split}}\) remaining significantly higher than \(\gamma_{\text{coa}}\) and \(\delta_{\text{split}}\) remaining significantly lower than \(\delta_{\text{coa}}\) (all at the 1% significance level, according to Wald tests).

Table 3 Results of the models M1 and M2 in two best-performing RDU specifications

These results are, however, worthy of attention not only for the significant differences between the respective coalesced- and split-form γ and δ, but also due to the absolute values of these weighting function parameters. While \(\gamma_{\text{coa}}\) is significantly lower than unity in the coalesced form, thus confirming the established RDU predictions (Wald tests, p values < 0.001), the split-form \(\gamma_{\text{split}}\) reveals an unusual picture. The estimated parameter value is not significantly different from unity in the one-parameter P1W specification and significantly higher than unity in the two-parameter P2W specification (Wald tests, p values < 0.001). These results thus conflict with the usual results of the RDU framework, namely, the aforementioned \(0 < \gamma < 1\).

Model M2 might provide additional insights into this uncommon result. This model, compared to M1, considers the weighting function exclusively, keeping the utility function parameters unchanged. According to Chow tests, the respective weighting parameter values estimated in M1 and M2 do not significantly differ between these model versions for the one-parameter P1W specifications (neither for the corresponding coalesced-form nor for the split-form parameters), while the two-parameter P2W specification estimates in M2 are different from the estimates in M1 at the 5% significance level for \(\gamma_{\text{split}}\), \(\delta_{\text{coa}}\) and \(\delta_{\text{split}}\).

The results of M2 confirm the results of M1 and reveal the familiar trend of significantly different weighting function parameters \(\gamma\) and \(\delta\) in the coalesced-form as compared to the split-form gamble pairs (Wald tests, p values < 0.001), with \(\gamma_{\text{coa}}\) significantly lower than unity (Wald tests, p values < 0.001). In this case, the \(\gamma_{\text{split}}\) parameters report practically no curvature, which implies that no weighting of probabilities for split-form gamble pairs could be identified in the P1W case (essentially making it equivalent to the EUW case), while for the two-parameter function P2W, the elevation parameter \(\delta_{\text{split}}\) alone assures that the weighting function is curved.
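A short worked step, added here as an illustration rather than taken from the estimation output, shows why \(\gamma = 1\) removes the curvature in P1W but not in P2W. Setting \(\gamma = 1\) in (9) and (8) gives

$$w_{\text{P1W}} \left( p \right) = {\text{e}}^{ - ( - \ln p)} = p,\qquad w_{\text{P2W}} \left( p \right) = {\text{e}}^{ - \delta ( - \ln p)} = p^{\delta } .$$

Hence P1W collapses to the linear EUW case, whereas P2W remains a power function of p that is linear only if \(\delta = 1\); it is convex for \(\delta > 1\) and concave for \(\delta < 1\).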

As depicted in Fig. 3 where we plot the results of M1 graphically, the results of the one-parameter weighting functions are admittedly easier to interpret than the two-parameter weighting functions. The implications of splitting—namely, that split-form gambles result in less probability weighting than coalesced-form gambles—hold for both P1W and P2W, but the particular shapes of the split and coalesced P2W should be interpreted with caution, as the \(\delta\) values are quite high and indicate more pessimism in the split form.

Fig. 3 Illustration of the weighting-function results of M1 in Table 3 above. The solid lines refer to coalesced-form data and the dashed lines refer to split-form data in the one-parameter (black, P1W) and two-parameter (gray, P2W) Prelec weighting functions, respectively

Although there are some limitations to our results, we can indeed conclude that violations of coalescing for split-form gamble pairs in the RDU framework not only explain differences in the model fit, but also affect the subjective weighting function. Namely, the use of the split-form gamble pairs appears to change gamble choices and considerably diminish the weighting as compared to the coalesced-form gambles.

Note that we also check the robustness of these results by considering further weighting function specifications and adjusting the RDU parameter estimates to further variables. Our analysis (see M1, M2 in Appendix 4 and M3, M4 in Appendices 5 and 6) shows that the familiar trend of significantly different weighting function parameters \(\gamma\) and \(\delta\) for the coalesced- versus split-form gambles remains strong and consistent across all RDU specifications and considered models. Taken together, our results indicate that one source of the more pronounced deviation from EU could be comparison difficulties caused by the coalesced form of the presented gambles.

5 Discussion and limitations

The results of this paper are insightful in a number of ways. Firstly, we have shown that the fit of EU is indeed better if the gamble pairs are presented in a split form. This result indicates that the split form improves prescriptive decision analysis. For example, we could conclude that one should rather use split-form than coalesced-form gamble pairs when advising decision making.

Secondly, we have found evidence for significant differences in magnitude between the \(\gamma\) parameters in the RDU weighting functions for coalesced- and split-form gamble pairs in the given dataset. Meanwhile, the somewhat mixed evidence regarding the logically independent elevation parameter \(\delta\) calls for further examination of this property. Note that interaction effects between \(\delta\) and \(\alpha\) cannot be ruled out in this setting and are a possible source of the mixed evidence. Still, although the values of \(\delta\) are quite high (particularly for coalesced gambles), there is no indication of strong cross-parameter compensation: \(\gamma\) and \(\delta\) do not vary considerably in concert, and \(\delta\) values are relatively stable in all model specifications. Whether this result reflects some more fundamental theoretical issues or is merely a method artifact is left for further research.

Thirdly, it appears that presenting gamble pairs in a split form changes gamble choices so as to bring RDU closer to EU. This is the case not only for the measures of model fit, but also for the properties of the weighting function. With the curvature parameter \(\gamma\) largely closer to unity (and often not significantly different from unity) for split-form than for coalesced-form gambles, the results indicate that the subjects tend to rely less on subjective probability weighting when evaluating split-form gamble pairs of a certain type. They act comparatively more “normatively” than expected and thus put one of the cornerstones of RDU (and by implication, CPT) into question.

Note, however, that the gambles in our study were quite specific in that values and probabilities of the high outcomes were relatively similar between the gambles within a decision. Note also that no certain outcomes were included that might drive the typically stronger deviations from linearity of the weighting function. We thus acknowledge that the particular shape of the weighting function might be different with additional gambles; tests for generality are left for further research.

6 Conclusions

The results of this paper invite readers to carefully rethink RDU and its perspective on subjective probability weighting or, more particularly, on the stability of the probability weighting function against the editing of lotteries. We have provided some reasons to conclude that the non-linearity in the weighting function might be more pronounced as a result of coalescing. That is, probability weighting does not necessarily appear to be an ingrained feature, but rather a result facilitated by processing difficulties.

What do our results imply for utility theory and its applications? Firstly, one could argue in favor of using gamble pairs in a split form and employing EU as a decision criterion, given that it performs rather well for the split-form pairs. However, real-life gambles do not always occur in a split form. Therefore, a second recommendation could be to increasingly employ other theories that imply splitting effects. Other models, like the transfer of attention exchange (TAX) model or the rank-affected multiplicative weights (RAM) model, could potentially be rivals to the EU, RDU and CPT (Birnbaum and Chavez 1997; Birnbaum 1999, 2008). Thirdly, one could continue employing RDU (and by implication, CPT), albeit cautiously, knowing that the splitting has a fairly intelligible effect on the weighting function.

Much remains to be done still. Firstly, because this paper concerns coalesced- and split-form gamble pairs in the gain domain exclusively, we advise extending the scope of the forthcoming experiments to also include mixed and loss-only gambles (to examine the splitting effects in the respective parameters for the loss domain, including the loss aversion parameter). Secondly, because the construction of splitting appears to have an influence on the resulting model fit, we advise extending the binary concept of splitting and examining a variety of directly comparable coalesced versus differently split datasets. Thirdly, because the interpretations of the psychological reasons behind the splitting effects are still manifold, we advise gathering further insights from parallel lines of research, such as neuroeconomics and others.