ANALYSIS
Contingent valuation estimates for environmental goods: Validity and reliability
Introduction
The contingent valuation method (CV) has become a recognised tool for estimating non-market values consistent with underlying Hicksian postulates. The method makes it possible to elicit monetary values of environmental goods and services that are not reflected in market prices. Over 10,000 CV studies have been undertaken world-wide (Haab et al., 2020) and, even though academics seem to overstate the relevance of their estimates for policy making (Tinch et al., 2019), policy-makers and practitioners already rely on this method to assess policy impacts and inform litigation cases (Hanley and Czajkowski, 2019).
Despite its pragmatic acceptance in public policy evaluation, the application of CV-based estimates in decision-making remains controversial, as critics still question the accuracy (or even the capacity) of CV to reflect non-market values (Hausman, 2012; Maas and Svorenčík, 2017; McFadden and Train, 2017). According to Bishop and Boyle (2019), assessing CV accuracy involves evaluating both the validity and the reliability of the method. Validity relates to bias, while reliability relates to variance. As Mitchell and Carson (1989) state, “the validity of a measure is the degree to which it measures the theoretical construct under investigation. This construct is, in the nature of things, unobservable; all we can do is to obtain imperfect measurements of that entity” (p. 190). The construct that CV studies try to elicit is one of the Hicksian surplus values. Validity therefore implies that the statistical expectation of an estimated mean of WTP equals the true mean for the good being valued. Critics of the CV method argue that elicited values suffer from serious biases that make the method invalid for quantifying non-market values. Yet the increasing number of CV applications and their uptake in policy evaluation point towards a situation where some value is better than no value. The main obstacle to assessing validity is the fact that the true market value is unknown (Reiling et al., 1990), so there is no benchmark against which to evaluate whether values are biased. Economists supporting CV argue that meticulous design and analysis of the valuation application can minimize bias (Johnston et al., 2017). Moreover, due to the lack of benchmarks, studies testing for validity have focused mainly on comparing CV estimates with values obtained from alternative non-market valuation methods and on testing whether their findings are consistent with assumed economic principles (Bishop et al., 1995; Vossler and Kerkvliet, 2003; Vossler et al., 2003; Bishop and Boyle, 2019).
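The link between validity and bias, and between reliability and variance, can be written compactly. The following is a sketch only; the notation (\mu for the true mean WTP and \hat{\mu} for its CV estimate) is assumed here for illustration and is not taken from the cited works.

```latex
\begin{align}
  \operatorname{Bias}(\hat{\mu}) &= E[\hat{\mu}] - \mu
    && \text{(validity: zero bias means } E[\hat{\mu}] = \mu\text{)} \\
  \operatorname{Var}(\hat{\mu}) &= E\!\left[(\hat{\mu} - E[\hat{\mu}])^{2}\right]
    && \text{(reliability: small variance)} \\
  E\!\left[(\hat{\mu} - \mu)^{2}\right] &= \operatorname{Bias}(\hat{\mu})^{2} + \operatorname{Var}(\hat{\mu})
    && \text{(overall accuracy)}
\end{align}
```

Under this reading, an accurate estimate needs both terms of the last line to be small, which is why the two properties are assessed separately.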
Evaluating the accuracy of CV estimates also requires analysing their reliability (Carson et al., 1997). Reliability refers to the variability of the estimates inferred in CV applications. If the variance of the estimated mean is larger than the true population variability, the estimated WTP might not be trustworthy (Reiling et al., 1990). In this paper, we focus on a specific form of reliability, known as temporal reliability or temporal stability (McConnell et al., 1998). Temporal stability of values can also be considered an indicator of reliability because the values can be replicated by follow-up applications (Bliem et al., 2012), and it shows the capacity to recover the same estimates for the same value at different points in time if preferences are invariant. Indeed, preference stability is one of the key assumptions of marginalist economic analysis. While this assumption has been contested by behavioural economists even for financial assets and traders, in environmental valuation preference instability has been used as evidence of a lack of reliability, on the claim that stated preference methods do not elicit preferences but rather co-construct them (Bateman et al., 2008).
Several authors have pointed out that tests of validity and reliability ought to be more routinely integrated in stated preference methods (Rakotonarivo et al., 2016; Oerlemans et al., 2016; Wang et al., 2016). Building on this, the objective of this paper is to provide additional evidence about the accuracy of the CV method in estimating well-founded non-market values, assessing both validity and reliability. This study focuses on WTP estimates for ecological status improvements obtained from two CV surveys conducted in 2010 and 2017 for the case of the Mar Menor coastal lagoon (SE Spain).
We assess CV accuracy in two ways. First, taking advantage of a natural experiment, we are able to compare WTP estimates with real monetary contributions to a crowdfunding initiative that pursues the restoration of the lagoon. We follow recent literature that identifies crowdfunding as a source of revealed preferences suitable for non-market valuation (Frey, 2019). In this manner, we perform a criterion validity test to assess whether CV estimates are free of hypothetical bias, observing whether individuals participating in the real and hypothetical markets behave similarly (Johnston, 2006). In addition, the reliability of the CV estimates is evaluated by analysing the statistical distributions of WTP and of donations to the crowdfunding initiative. Second, by analysing construct validity at the individual level, we extend the analysis reported in Perni et al. (2020), studying individual behaviour and testing whether changing preferences are driven by strictly rational economic behaviour or by other factors not consistent with economic theory. For the purpose of measuring rational changes in individual preferences, we construct a synthetic panel dataset from the two contingent valuation surveys using the Propensity Score Matching method (PSM) (Stuart, 2010).
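As an illustration of how a synthetic panel could be built by propensity score matching, the sketch below pairs respondents from two survey waves whose estimated probability of belonging to the later wave is similar. It is a minimal example under stated assumptions, not the procedure used in this study: the two covariates, the sample sizes, the simulated data, the greedy 1:1 matching rule and the caliper of 0.05 are all hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Propensity score: estimated probability that x belongs to the later wave."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def match_nearest(scores_a, scores_b, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (i, j) pairs: index i in scores_a matched to index j in scores_b.
    Candidates farther apart than the caliper are left unmatched.
    """
    available = list(range(len(scores_b)))
    pairs = []
    for i, s in enumerate(scores_a):
        if not available:
            break
        j = min(available, key=lambda c: abs(scores_b[c] - s))
        if abs(scores_b[j] - s) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical, simulated respondents: [standardised age, standardised income].
random.seed(0)
wave_2010 = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(60)]
wave_2017 = [[random.gauss(0.3, 1.0), random.gauss(0.2, 1.0)] for _ in range(40)]

# Wave membership is the "treatment" indicator (0 = 2010, 1 = 2017).
X = wave_2010 + wave_2017
y = [0] * len(wave_2010) + [1] * len(wave_2017)

w = fit_logistic(X, y)
scores = [predict(w, x) for x in X]
scores_2010 = scores[:len(wave_2010)]
scores_2017 = scores[len(wave_2010):]

# Each pair links a 2017 respondent to a comparable 2010 respondent.
pairs = match_nearest(scores_2017, scores_2010, caliper=0.05)
print(f"{len(pairs)} matched pairs out of {len(wave_2017)} 2017 respondents")
```

Matching without replacement keeps each earlier respondent in at most one pair, at the cost of discarding later respondents who have no close counterpart. In an applied setting, covariate balance between the matched groups would normally be checked before comparing WTP across waves.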
Our paper contributes to the understanding of the accuracy of contingent valuation estimates in two ways. First, we are able to carry out a criterion validity test of hypothetical values obtained from a CV application. In contrast to Champ and Bishop (2001), we use an incentive-compatible payment vehicle (i.e., a tax on the water bill) and compare it with real donations to a crowdfunding initiative. To the best of our knowledge, this is the first study in which hypothetical responses from CV surveys using taxes as the payment vehicle have been compared with monetary contributions to crowdfunding. Second, we are able to estimate how much of the variability of CV estimates across time is due to changes consistent with rational economic premises.
The rest of the paper is structured as follows. Section 2 briefly describes the environmental good subject to valuation. Section 3 presents the main concepts of validity and reliability that we follow in this research and describes the methodology, econometric approach and hypotheses tested. Section 4 presents the results and, finally, Section 5 summarizes the main findings and conclusions of the study.
The Mar Menor: environmental and management context
The Mar Menor is a hypersaline coastal lagoon that covers 135 km² (Fig. 1). It is located in the Segura River Basin District (SRBD) in southern Spain. A sand bar 20 km long and 100–900 m wide separates the lagoon from the Mediterranean Sea. Dunes, crypto-wetlands and coastal saltpans make up the nearby landscape of this natural site, providing habitat for singular flora and fauna adapted to conditions of high temperature (ranging from 12 °C to 30 °C) and high salinity (ranging from 42 to 47
Assessing non-market valuation applications
The assessment of the accuracy of contingent valuation is an area where multiple terminologies coexist. Here we follow the nomenclature proposed by Bishop and Boyle (2019), which distinguishes between validity and reliability: validity focuses on the mean of the estimates and reliability focuses on their variance. Regarding the former, these authors take the three types of validity criteria put forward by Mitchell and Carson (1989) to assess whether economic valuation applications produce
Criterion validity and reliability of WTP estimates
Table 4 summarizes the main hypotheses and statistics regarding the WTP evaluation following criterion validity. Recall that only positive values for WTP from the full dataset are used (i.e., data matching does not apply in this section). According to Student's t-test, the means of WTP2010+ and WTP2017+ do not differ significantly from that of CrowdPay; we therefore fail to reject the null hypothesis, which signals that both CV valuation applications are valid according to this
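A comparison of this kind, between stated WTP and observed donations, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' computation: the data are simulated, the sketch uses Welch's unequal-variance variant rather than the pooled Student's t-test named in the text, and the p-value relies on a normal approximation that is reasonable only for large samples. The variance ratio is included as a crude descriptive check on dispersion, not as a formal reliability test.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def two_sided_p_normal(t):
    """Two-sided p-value from a standard normal approximation (large samples)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def variance_ratio(a, b):
    """Ratio of sample variances, a descriptive measure of relative dispersion."""
    return variance(a) / variance(b)


# Hypothetical, simulated positive amounts in euros (not the study's data).
random.seed(1)
wtp_positive = [random.lognormvariate(3.0, 0.6) for _ in range(200)]
crowd_pay = [random.lognormvariate(3.0, 0.8) for _ in range(150)]

t_stat, df = welch_t(wtp_positive, crowd_pay)
p_value = two_sided_p_normal(t_stat)
ratio = variance_ratio(wtp_positive, crowd_pay)

print(f"t = {t_stat:.3f}, df = {df:.1f}, approximate p = {p_value:.3f}")
print(f"variance ratio (WTP / donations) = {ratio:.3f}")
```

Failing to reject equality of means is consistent with the absence of hypothetical bias but does not prove it, so the sample sizes and the power of the test matter when interpreting such a result.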
Discussion and conclusion
This research deals with both aspects of the accuracy of CV-based WTP estimates: validity and reliability. In particular, it uses a natural experiment setting that benefits from a crowdfunding campaign defined in comparable terms, as the two CV surveys were employed to infer WTP to improve the Mar Menor coastal lagoon. This criterion validity test is unique in the literature on economic valuation. Regarding reliability, we benefit from having the same CV survey at different points in time which is
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The work described in this paper was carried out within the framework of the 20912/PI/18 project, financed by “Fundación Séneca-Agencia de Ciencia y Tecnología de la Región de Murcia”. The authors are solely responsible for the content of the paper. The views expressed are purely those of the authors and may not in any circumstances be regarded as stating an official position of the European Commission.
References (70)
- et al. Assessment of real and perceived cost-effectiveness to inform agricultural diffuse pollution mitigation policies. Land Use Policy (2021)
- et al. Willingness-to-volunteer and stability of preferences between cities: estimating the benefits of stormwater management. J. Environ. Econ. Manag. (2020)
- et al. The aggregation of environmental benefit values: welfare measures, distance decay and total WTP. Ecol. Econ. (2006)
- et al. Learning design contingent valuation (LDCV): NOAA guidelines, preference learning and coherent arbitrariness. J. Environ. Econ. Manag. (2008)
- et al. Temporal stability of individual preferences for river restoration in Austria using a choice experiment. J. Environ. Manag. (2012)
- et al. The impact of the bird flu on public willingness to pay for the protection of migratory birds. Ecol. Econ. (2008)
- Propensity Scoring. International Encyclopedia of the Social & Behavioral Sciences (2015)
- et al. The Mar Menor lagoon (SE Spain): a singular natural ecosystem threatened by human activities. Mar. Pollut. Bull. (2007)
- The integrated territorial investment (ITI) of the Mar Menor as a model for the future in the comprehensive management of enclosed coastal seas. Ocean Coast. Manag. (2018)
- Is hypothetical bias universal? Validating contingent valuation responses using a binding public referendum. J. Environ. Econ. Manag. (2006)