Introduction

One of the most robust empirical regularities in studies of human perception is the central tendency (or regression) effect: across various perceptual domains, estimates of stimulus magnitude are consistently biased towards the center of the magnitude distribution (Hollingworth, 1910; Stevens & Greenbaum, 1966). One prominent explanation for the central tendency effect is that sensory signals are “regularized” to reduce the influence of noise. Intuitively, when the signal-to-noise ratio is very low, then the best guess is the center of the magnitude distribution. When the signal-to-noise ratio is very high, then the best guess will ignore the magnitude distribution and only use the signal. At intermediate levels, the best guess will be somewhere in between the signal and the center of the magnitude distribution. These intuitions can be formalized in a Bayesian framework (Petzschner, Glasauer, & Stephan, 2015), where the prior expresses the effect of the magnitude distribution, and the likelihood expresses the effect of sensory noise. Bayes’ rule prescribes how these two sources of information should be optimally combined.

Central to the Bayesian framework is the role of noise in predicting the strength of the central tendency effect. However, typically we cannot directly measure the noise level. This empirical gap is frequently filled by ad hoc assumptions or by fitting free parameters to the data, rendering the Bayesian framework possibly unfalsifiable (Marcus & Davis, 2013; Jones & Love, 2011). The same issue vexes non-Bayesian models (e.g., Ratcliff & McKoon 2018, 2020) that implicitly or explicitly make assumptions about sensory or cognitive noise.

One way to get around this issue is to collect other measures that are hypothesized to capture the magnitude of noise. Even if we cannot measure the noise level directly, we can make predictions about the relationships between indirect measures and magnitude estimates as a test of model predictions. We pursue this strategy here, using subjective confidence and response variability as auxiliary measures to triangulate the effects of noise on perception. To make our predictions precise, we develop a simple Bayesian model of magnitude estimation and derive a number of generic predictions from it (i.e., predictions that don’t depend strongly on the parameter values). However, our goal is not to advocate for the Bayesian model versus alternative models, but rather to formalize some hypothetical regularities which (if true) would need to be satisfied by any model of magnitude estimation.

Our model makes several key predictions, elaborated in the next section: (i) confidence should decrease with sensory noise; (ii) the sensitivity of magnitude estimates to actual magnitudes should increase with subjective confidence and decrease with sensory noise; (iii) the central tendency effect should decrease with confidence and increase with sensory noise; (iv) both confidence and sensitivity should decrease with stimulus magnitude, assuming sensory noise that grows with stimulus magnitude; and (v) response variability should increase with sensory noise and decrease with confidence when prior uncertainty is high relative to sensory noise.

We test these model predictions in two ways. First, we implement a new large-scale magnitude estimation experiment in which we elicit both magnitude estimates and subjective confidence. A key feature of our experiment is that we exogenously vary the objective stimulus, the mean of the stimulus distribution, and the magnitude of sensory noise. We collect these data because, as explained below, existing data sets on magnitude estimation and subjective confidence lack sufficient variation in some of the key elements of Bayesian models. Second, moving beyond our new experiment, we re-analyze data from several earlier studies that measured both continuous reports of stimulus magnitude and subjective confidence. Although these earlier studies are limited in several ways (which motivated our new experiment), the results from our re-analysis provide converging evidence for our model’s main predictions (Footnote 1). Whether or not one accepts the Bayesian framework, these findings provide constraints on any model of confidence and central tendency in perceptual judgment.

Theoretical framework

To motivate our empirical predictions, we will first lay out a theoretical framework based on a simple Bayesian estimation problem, which mirrors the experimental tasks given to subjects.

Bayesian estimation

We model a task in which subjects are asked to estimate the magnitude of a stimulus x from a noise-corrupted signal \(s = x + \epsilon\), where \(\epsilon \sim \mathcal {N}(0,\sigma _{\epsilon }^{2})\) is Gaussian-distributed sensory noise. For concreteness, consider the task of estimating the number of objects on a screen. In this case x is the true number of objects and s is the observer’s sensory representation of the number.

If the subject has a Gaussian prior over the magnitude, \(x\sim \mathcal {N}(\mu _{x}, {\sigma _{x}^{2}})\), then the posterior is also Gaussian:

$$ \begin{array}{@{}rcl@{}} P(x|s) = \mathcal{N}(x; \lambda s + (1-\lambda) \mu_{x} , \sigma_{\hat{x}}^{2}), \end{array} $$
(1)

where the posterior variance is

$$ \begin{array}{@{}rcl@{}} \sigma_{\hat{x}}^{2} = (1-\lambda){\sigma_{x}^{2}}, \end{array} $$
(2)

and the sensitivity is

$$ \lambda=\frac{{\sigma_{x}^{2}}}{{\sigma_{x}^{2}}+\sigma_{\epsilon}^{2}}, $$
(3)

which takes values between 0 and 1. Intuitively, the magnitude estimate will be more sensitive (higher λ) to the signal when the sensory noise variance (\(\sigma _{\epsilon }^{2}\)) is small relative to the prior uncertainty (\({\sigma _{x}^{2}}\)).
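As a minimal sketch of Eqs. 1 to 3, the following Python function computes the sensitivity, posterior mean, and posterior variance for a single signal. The function name and the numerical values are illustrative assumptions, not part of the original analysis.

```python
def gaussian_posterior(s, mu_x, var_x, var_eps):
    """Return (sensitivity, posterior mean, posterior variance) for a
    Gaussian prior N(mu_x, var_x) and Gaussian sensory noise N(0, var_eps)."""
    lam = var_x / (var_x + var_eps)            # Eq. 3: sensitivity
    post_mean = lam * s + (1.0 - lam) * mu_x   # Eq. 1: posterior mean
    post_var = (1.0 - lam) * var_x             # Eq. 2: posterior variance
    return lam, post_mean, post_var


# Illustrative values (assumptions): prior centered on 40 dots, signal of 50,
# and equal prior and noise variances.
lam, post_mean, post_var = gaussian_posterior(
    s=50.0, mu_x=40.0, var_x=100.0, var_eps=100.0
)
```

With equal prior and noise variances, the sensitivity is 0.5, so the posterior mean lies halfway between the signal and the prior mean.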

We follow most of the literature in assuming that the prior approximates either the experienced magnitude distribution or the instructed distribution, depending on the paradigm. For analytical tractability, we assume a Gaussian prior even when the experimentally controlled stimulus distribution is non-Gaussian (e.g., uniform). In these cases, the mean is assumed to be equal to the center of the uniform distribution (e.g., Roach, McGraw, Whitaker, & Heron, 2017; Acerbi, Wolpert, & Vijayakumar, 2012). Our experimental predictions do not depend strongly on the choice of prior variance, provided that it is a fixed function of the magnitude distribution range.

We posit that the subjective estimate \(\hat {x}\) is chosen to minimize expected loss \({\mathscr{L}}(\hat {x},x)\) conditional on the signal:

$$ \begin{array}{@{}rcl@{}} \hat{x} = \underset{\hat{x}}{\arg\min} \mathbb{E}[\mathcal{L}(\hat{x},x)|s]. \end{array} $$
(4)

The optimal subjective estimate under the quadratic loss \({\mathscr{L}}(\hat {x},x) = (x - \hat {x})^{2}\) equals the posterior mean (see, e.g., Berger 1985):

$$ \begin{array}{@{}rcl@{}} \hat{x} = \lambda s+ (1-\lambda) \mu_{x}. \end{array} $$
(5)

For the Gaussian estimation problem described here, this prediction holds true for several other loss functions. For example, the Bayes-optimal estimator for the absolute loss \({\mathscr{L}}(\hat {x},x) = |x - \hat {x}|\) is the posterior median, and the Bayes-optimal estimator for the relaxed 0-1 loss (\({\mathscr{L}}_{\kappa }(\hat {x},x) = 1\) if \(|x - \hat {x}|\geq \kappa \), 0 otherwise) is the posterior mode in the limit \(\kappa \rightarrow 0\). Since the mean, median, and mode for a Gaussian distribution have the same value, our predictions are invariant across these choices of loss function.

Subjective confidence

Magnitude estimation tasks require subjects to report a single point estimate of the magnitude, but Bayesian models hypothesize that subjects are representing an entire distribution over magnitudes. Subjective confidence judgments can potentially provide a probe of this distributional representation. According to the Bayesian confidence hypothesis (Meyniel, Sigman, & Mainen, 2015; Pouget, Drugowitsch, & Kepecs, 2016; Sanders, Hangya, & Kepecs, 2016; Aitchison, Bang, Bahrami, & Latham, 2015; Fleming & Daw, 2017; Rahnev, Koizumi, McCurdy, D’Esposito, & Lau, 2015), subjective confidence corresponds to the posterior probability that an action is optimal (in this context, the probability that the subjective estimate equals the objective stimulus magnitude). Note that for continuous magnitudes, the probability that the point estimate \(\hat {x}\) equals the objective magnitude x is 0. However, we can evaluate the posterior probability that the true magnitude falls within an infinitesimally small region around the posterior mean estimate. In the limit, this probability is proportional to the density of the normally distributed estimate, evaluated at its mean: \(\frac {1}{\sigma _{\hat {x}} \sqrt {2\pi }} \equiv c\). Thus, we see that Bayesian confidence (c) for the Gaussian estimation problem is inversely proportional to the posterior standard deviation. This result motivates the use of the standard deviation as a measure of “cognitive uncertainty” (Enke & Graeber, 2020) (Footnote 2). Most importantly, note that under a constant prior variance \({\sigma _{x}^{2}}\), Bayesian confidence c decreases in the variance of sensory noise, \(\sigma _{\epsilon }^{2}\). Because our experimental design controls the prior variance as argued above, we will in the following interpret Bayesian confidence as an approximate (inverse) measure of sensory noise.
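The dependence of confidence on sensory noise can be illustrated with a short sketch. It takes confidence to be the Gaussian posterior density evaluated at its mean, \(1/(\sigma_{\hat{x}}\sqrt{2\pi})\), and sweeps over noise variances at a fixed prior variance. The function name and the numerical values are assumptions made for illustration.

```python
import math


def bayesian_confidence(var_x, var_eps):
    """Gaussian posterior density at its mean, 1 / (sd * sqrt(2 * pi)).

    Requires var_eps > 0; with zero noise the posterior is a point mass
    and the density is unbounded.
    """
    lam = var_x / (var_x + var_eps)
    post_sd = math.sqrt((1.0 - lam) * var_x)
    return 1.0 / (post_sd * math.sqrt(2.0 * math.pi))


# Fixed prior variance (assumed value), increasing sensory noise variance.
noise_levels = [1.0, 25.0, 100.0, 400.0]
confidences = [bayesian_confidence(100.0, v) for v in noise_levels]
```

Under these assumptions, each increase in noise variance lowers the computed confidence, which is the relationship the argument above relies on.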

Predictions

Using this framework, we can now lay out a set of theoretical predictions, which we will empirically test in subsequent sections.

Prediction 1

Sensitivity (λ) monotonically increases with confidence (c).

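This prediction follows from the expression relating sensitivity to confidence. Confidence is the Gaussian posterior density at its mean, \(c = 1/(\sigma_{\hat{x}}\sqrt{2\pi})\), so \(\sigma_{\hat{x}}^{2} = 1/(2\pi c^{2})\); substituting into Eq. 2 and solving for λ gives:

$$ \begin{array}{@{}rcl@{}} \lambda = 1 - \frac{1}{2 \pi c^{2} {\sigma_{x}^{2}}}. \end{array} $$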
(6)

Although the model quantity λ is not directly observable, we can approximate it empirically as described in the experimental analysis below. The next prediction concerns the relationship between confidence and the central tendency effect, a straightforward corollary of Prediction 1 because \(\frac {\partial \hat {x}}{\partial \mu _{x}} = 1 - \lambda \). As discussed above, we follow previous work in assuming that the prior is determined by the experienced or instructed magnitude distribution, and hence μx is the mean of the magnitude distribution.

Prediction 2

The central tendency effect (\(\frac {\partial \hat {x}}{\partial \mu _{x}}\)) monotonically decreases with confidence.

Here we have defined the central tendency effect formally as the degree to which the perceptual estimate changes with the average stimulus magnitude.

We now state the causal analogs of these predictions based on exogenous changes in sensory noise.

Prediction 3

An exogenous increase in sensory noise (\(\sigma ^{2}_{\epsilon }\)), which reduces subjective confidence, decreases sensitivity.

Prediction 4

An exogenous increase in sensory noise increases the central tendency effect.

Next, we theoretically explore how the strength of the central tendency effect depends on the stimulus magnitude. A well-known phenomenon is that response variability increases with stimulus magnitude according to Weber’s law. A common interpretation is that the signal-to-noise ratio decreases with stimulus magnitude, due either to a non-linear transformation of magnitude (e.g., Petzschner & Glasauer, 2011; Fechner 1860; Nieder & Miller 2003; Stevens 1961; Roach et al., 2017) or to magnitude-dependent scaling of sensory noise (e.g., Treisman 1964; Gibbon 1977; Gibbon & Church 1981). For concreteness, we examine the implications of the latter assumption:

$$ \frac{\partial \sigma_{\epsilon}^{2}}{\partial x} > 0 \Rightarrow \frac{\partial c}{\partial x} < 0 \quad \text{and} \quad \frac{\partial \lambda}{\partial x} < 0. $$
(7)

Prediction 5

As the stimulus magnitude increases, confidence and sensitivity decrease. The latter effect implies a stronger central tendency effect for larger magnitudes.
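To make Prediction 5 concrete, the sketch below assumes a Weber-like noise model in which the noise standard deviation is proportional to the stimulus magnitude. The Weber fraction of 0.2 and the prior variance of 75 are hypothetical values chosen only for illustration, and confidence is taken to be the Gaussian posterior density at its mean.

```python
import math


def sensitivity_and_confidence(x, var_x, weber_fraction):
    """Sensitivity and confidence when noise SD scales with magnitude x."""
    var_eps = (weber_fraction * x) ** 2        # assumption: sd_eps = w * x
    lam = var_x / (var_x + var_eps)            # Eq. 3
    post_sd = math.sqrt((1.0 - lam) * var_x)   # square root of Eq. 2
    conf = 1.0 / (post_sd * math.sqrt(2.0 * math.pi))
    return lam, conf


magnitudes = [15, 25, 35, 45, 55, 65]
results = [
    sensitivity_and_confidence(x, var_x=75.0, weber_fraction=0.2)
    for x in magnitudes
]
sensitivities = [lam for lam, _ in results]
confidences = [conf for _, conf in results]
```

Both lists decrease as the magnitude grows, so larger stimuli are pulled proportionally more toward the prior mean under this noise model.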

Finally, we turn to response variability. Theoretically, variability is affected both by the effect of sensory noise on estimates (which increases variability) and by the counteracting central tendency effect (which decreases variability; see Enke & Graeber 2020). Because \(\hat{x} = \lambda s + (1-\lambda)\mu_{x}\) and the signal s has variance \(\sigma_{\epsilon}^{2}\) for a fixed stimulus x, response variability is given by:

$$ \begin{array}{@{}rcl@{}} \text{Var}(\hat{x}|x) = \lambda^{2} \sigma_{\epsilon}^{2} = \left( \frac{{\sigma^{2}_{x}}}{{\sigma^{2}_{x}} + \sigma^{2}_{\epsilon}} \right)^{2} \sigma_{\epsilon}^{2}. \end{array} $$
(8)

The relationship between response variability and other quantities depends on the degree of prior uncertainty relative to sensory noise.

Prediction 6

When sensory noise is small relative to prior uncertainty (\(\sigma ^{2}_{\epsilon } < {\sigma ^{2}_{x}}\)), response variability increases in sensory noise variance and decreases in confidence. When sensory noise is large relative to prior uncertainty (\(\sigma ^{2}_{\epsilon } > {\sigma ^{2}_{x}}\)), response variability decreases in sensory noise variance and increases in confidence.

Intuitively, when sensory noise is zero, the response is exactly equal to the stimulus and there is no residual variability. In the limit of large sensory noise, the response equals the prior mean and again there is no residual variability. Thus there is response variability only for intermediate values of sensory noise.
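The hump shape described by Prediction 6 can be checked numerically. Differentiating Eq. 8 with respect to the noise variance gives \(\sigma_{x}^{4}(\sigma_{x}^{2} - \sigma_{\epsilon}^{2})/(\sigma_{x}^{2} + \sigma_{\epsilon}^{2})^{3}\), which changes sign where the two variances are equal. The sketch below evaluates Eq. 8 on a coarse grid; the prior variance and grid values are illustrative assumptions.

```python
def response_variance(var_x, var_eps):
    """Eq. 8: Var(x_hat | x) = lambda^2 * var_eps."""
    lam = var_x / (var_x + var_eps)
    return lam ** 2 * var_eps


var_x = 100.0  # assumed prior variance
noise_grid = [0.0, 25.0, 50.0, 100.0, 200.0, 400.0, 1e6]
variances = [response_variance(var_x, v) for v in noise_grid]

# Noise variance on the grid at which response variability is largest.
peak_noise = noise_grid[variances.index(max(variances))]
```

On this grid the maximum falls at a noise variance equal to the prior variance, where response variance equals one quarter of the prior variance, and it approaches zero at both extremes.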

Experiment

Although a considerable number of studies have implemented magnitude estimation tasks while also measuring subjective confidence, none of these studies explicitly varies all of the variables that are of interest in light of our theoretical predictions above. Specifically, we require a study setup that features (i) variation in stimuli; (ii) a meaningful degree of exogenous variation in the mean of the stimulus distribution; (iii) exogenous variation in sensory noise; and (iv) a large sample size to allow for sufficiently powered statistical analyses of the relationship between confidence and the precise mechanics of central tendency. Earlier studies typically had sample sizes of fewer than 50 participants (see Table 2).

Materials and methods

Participants

We recruited 300 participants from Amazon Mechanical Turk (MTurk). All participants gave informed consent prior to testing. To ensure that participants fully understood the experiment, they completed a comprehension check immediately after the instructions. Participants who failed the comprehension check were asked to leave the study and were compensated $0.50 for their time. Participants who passed the comprehension check proceeded with the experiment and were paid $4.50 for their participation. Participants also had the opportunity to earn a performance-based bonus payment of $1 if their estimate on a randomly chosen trial was within 2 of the stimulus magnitude. Of the 300 participants, 79 received the bonus payment. The hourly rate was between $9 and $11 per hour, depending on whether they received a bonus. The experiment was approved by the Harvard Institutional Review Board.

Stimuli

The stimuli were arrays of black dots on a white background. The number of dots ranged between 15 and 65. Note that we did not control for density, so we cannot make a strong claim about numerosity per se using these stimuli.

Each participant completed six blocks with a total of 240 arrays. For each participant, the average stimulus magnitude within each block was randomly drawn from a discrete uniform distribution on the integers from 30 to 50 (sampled without replacement). For each block, the within-block distribution was a uniform distribution centered at the block’s average stimulus magnitude and extending 15 on either side of it (sampled with replacement). This procedure ensured that our trials feature substantial variation both in actual stimuli within blocks and in average stimuli across blocks. We conceptualize the latter as the mean of the perceptual prior and use this variation to measure the central tendency effect. To make the prior salient to subjects, they were informed about the average stimulus value within a block at the beginning of each block, before making any estimates.

The stimulus duration on each trial was either 100 ms or 2000 ms, intermixed randomly within each block; each of the two duration conditions appeared on half of the trials in each block. This feature of our experiment allows us to leverage exogenous variation in sensory noise and confidence to test Prediction 3. This is motivated by the common assumption that sensory evidence is accumulated across time, such that the signal-to-noise ratio is higher for longer durations (Inglis & Gilmore, 2013; Cheyette & Piantadosi, 2019; 2020).
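The sampling scheme described above can be sketched as follows. This is not the authors' experiment code: the function name, the equal split of 40 trials per block (240 arrays over six blocks), and the use of integer-valued draws within each block are assumptions made for illustration.

```python
import random


def make_schedule(rng, n_blocks=6, trials_per_block=40, half_range=15):
    """Build one participant's trial list as a list of dicts."""
    # Block means: integers 30..50, sampled without replacement.
    block_means = rng.sample(range(30, 51), n_blocks)
    schedule = []
    for block_mean in block_means:
        # Half of the trials in each block use each duration, in random order.
        durations = [100, 2000] * (trials_per_block // 2)
        rng.shuffle(durations)
        for duration_ms in durations:
            # Within-block stimuli: uniform on mean +/- 15, with replacement.
            dots = rng.randint(block_mean - half_range, block_mean + half_range)
            schedule.append(
                {"block_mean": block_mean, "dots": dots, "duration_ms": duration_ms}
            )
    return schedule


schedule = make_schedule(random.Random(1))
```

Because block means lie between 30 and 50 and stimuli extend 15 on either side, every generated dot count falls in the 15 to 65 range reported for the stimuli.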

Procedure

As illustrated in Fig. 1, each trial began with the presentation of a fixation cross at the center of the screen for 500 ms. Then a random dot array was presented for either 100 ms or 2000 ms. Next, participants had 10 s to type in their numerosity estimate using the number pad on the keyboard. Participants were then prompted to provide a confidence rating by clicking on a discrete slider ranging from 0 to 10.

Fig. 1 Illustration of the experiment. Each trial begins with a fixation cross, followed by a dot array. Participants then report their numerosity estimate and confidence

Data analysis

We excluded trials on which participants did not respond. We analyzed the data using linear mixed-effects models. Model 1 regresses subjective magnitude estimates on the true stimulus magnitude value (Stimulus) and the average stimulus magnitude in a block (AveStim), with random effects for the intercept, Stimulus, and AveStim grouped by participants. In the absence of a central tendency effect, the regression coefficient of the true stimulus should be one, while the coefficient of average stimulus should be zero. The central tendency effect is indicated by a stimulus coefficient of less than one and an average stimulus coefficient of greater than zero.
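The logic of Model 1 can be illustrated on simulated data. The sketch below generates estimates from Eq. 5 and fits an ordinary least-squares regression of the estimate on the stimulus and the block average. It deliberately omits the participant-level random effects, so it is a simplification of the mixed-effects model rather than a reproduction of it. The variances, number of blocks, and all names are assumptions.

```python
import random


def ols_two_regressors(y, x1, x2):
    """OLS of y on an intercept, x1, and x2 via the centered normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


rng = random.Random(0)
var_x, var_eps = 75.0, 75.0          # assumed variances, so lambda = 0.5
lam = var_x / (var_x + var_eps)

stimulus, ave_stim, estimate = [], [], []
for block in range(200):             # 200 simulated blocks (assumption)
    block_mean = rng.randint(30, 50)
    for trial in range(40):
        x = block_mean + rng.uniform(-15, 15)
        s = x + rng.gauss(0.0, var_eps ** 0.5)   # noisy signal
        stimulus.append(x)
        ave_stim.append(block_mean)
        estimate.append(lam * s + (1 - lam) * block_mean)  # Eq. 5

intercept, b_stimulus, b_avestim = ols_two_regressors(estimate, stimulus, ave_stim)
```

In this simulation the Stimulus coefficient recovers roughly λ and the AveStim coefficient roughly 1 − λ, which is the pattern (a slope below one and a positive weight on the block average) that the text treats as the signature of central tendency.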

To explore the role of sensory noise and confidence for the central tendency effect, we ran Model 2. Here, we added as additional regressors subjective confidence, the interaction between confidence and the stimulus, and the interaction between confidence and average stimulus, and added random effects for subjective confidence, the interaction between confidence and the stimulus, and the interaction between confidence and average stimulus grouped by participants. As derived above, our hypothesis is that higher confidence (as a consequence of lower sensory noise) is associated with a higher responsiveness to the stimulus and a lower responsiveness to the average stimulus, meaning that the first interaction effect should be positive and the second one negative.

In addition, Model 2 also accounts for our exogenous variation in stimulus duration, which we conceptualize as exogenous variation in sensory noise that translates into confidence. We include as regressors a binary Condition indicator (0 for the 100 ms condition, 1 for the 2000 ms condition), the interaction between the stimulus and Condition, and the interaction between average stimulus magnitude and Condition, and random effects for Condition, the interaction between the stimulus and Condition, and the interaction between average stimulus magnitude and Condition grouped by participants.

To see how the proportional regression to the prior changes in stimulus magnitude, we specified Model 3. Here, we regress an empirical estimate of λ on stimulus magnitude with random effects for the intercept and stimulus magnitude grouped by participants, and hypothesize a negative coefficient (Footnote 3).

Turning to the analysis of response variability, in Model 4, we regressed Variability (response standard deviation (Footnote 4) across multiple repetitions of the same magnitude) on Confidence (averaged across stimulus repetitions), with random effects for the intercept and Confidence grouped by participants. In Model 5, we regressed Variability on Condition, with random effects for the intercept and Condition grouped by participants. For better interpretability, we standardized the Confidence coefficients.
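One way to construct the Variability and averaged Confidence variables for a single participant is sketched below. The function name, the use of the sample standard deviation, and the choice to skip magnitudes shown only once are assumptions; the exact definition used in the analysis may differ.

```python
from collections import defaultdict
from statistics import mean, stdev


def variability_by_stimulus(trials):
    """Map each stimulus magnitude to (SD of estimates, mean confidence).

    `trials` is an iterable of (stimulus, estimate, confidence) tuples
    for one participant.
    """
    grouped = defaultdict(list)
    for stimulus, estimate, confidence in trials:
        grouped[stimulus].append((estimate, confidence))
    rows = {}
    for stimulus, pairs in grouped.items():
        if len(pairs) < 2:
            continue  # the SD is undefined for a single presentation
        estimates = [e for e, _ in pairs]
        confidences = [c for _, c in pairs]
        rows[stimulus] = (stdev(estimates), mean(confidences))
    return rows


# Hypothetical trials: (stimulus, estimate, confidence).
trials = [(30, 28, 6), (30, 32, 8), (45, 40, 3), (45, 50, 5), (45, 45, 4), (60, 52, 2)]
rows = variability_by_stimulus(trials)
```

The resulting pairs could then serve as the dependent and independent variables in a regression of the kind described for Model 4.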

Results

Preliminaries

Our study rests on two prerequisites: (i) The existence of a central tendency effect; and (ii) variation in subjective confidence as a function of stimulus duration.

The results of Model 1 (column 1 of Table 1) confirm the existence of a central tendency effect in our data: the coefficient of the true stimulus is substantially smaller than one [F(1, 71405) = 1372.2, p < 0.0001], and the coefficient of the average stimulus is considerably larger than zero [F(1, 71405) = 71.56, p < 0.0001]. To visualize the central tendency effect, we plotted subjective estimates as a function of stimulus magnitude. As shown in Fig. 2a, the slopes of the estimation functions are considerably smaller than 1, and the level of bias in the estimation functions increases in the mean of the stimulus distribution.

Table 1 Regression coefficients and standard errors for stimulus estimates in the new data set (Models 1 and 2)

Fig. 2 a Subjective estimates as a function of objective stimulus magnitude, shown separately for three within-block average stimulus ranges. b Confidence as a function of objective stimulus magnitude, shown separately for different stimulus duration conditions. c Subjective estimates as a function of objective stimulus magnitude, shown separately for low confidence (confidence levels 0-3) and high confidence (confidence levels 7-10). d Subjective estimates as a function of objective stimulus magnitude, shown separately for different stimulus durations. e Response standard deviation as a function of confidence. f Response standard deviation under short and long stimulus duration. Error bars indicate 95% confidence intervals

Our regression analysis revealed that confidence was higher for longer durations [t(71405) = 16.20, p < 0.0001] (Footnote 5), consistent with the hypothesis that lower sensory noise registers as higher confidence (Fig. 2b).

Confidence and the central tendency effect

We begin our main analysis by testing Predictions 1 and 2. The results of Model 2 (column 2 of Table 1) show that the central tendency effect is strongly moderated by subjective confidence. The positive interaction effect of confidence and stimulus implies that, for every one-standard-deviation increase in confidence, the responsiveness of subjective estimates to the stimulus value increases by 5 percentage points [t(71399) = 6.71, p < 0.0001]. This result confirms Prediction 1. Similarly, the negative interaction effect of confidence and average stimulus shows that confident subjects place substantially lower weight on the mean of the stimulus distribution [t(71399) = −3.18, p < 0.01], supporting Prediction 2 that subjective estimates are pulled towards the prior mean to a greater extent when confidence is low. These two interaction effects correspond to the central predictions of the model about the role of sensory noise for the central tendency effect. Figure 2c visualizes the results by revealing stronger central tendency effects when confidence is low, i.e., the implied response curve is substantially flatter.

Sensory noise and the central tendency effect

To confirm our correlational findings, we leverage the exogenous variation in confidence that is induced by stimulus duration. As shown in column 2 of Table 1, longer stimulus durations increased the effect of objective stimulus magnitude [t(71399) = 20.91, p < 0.0001] and decreased the effect of average stimulus magnitude [t(71399) = −12.24, p < 0.0001], consistent with a weaker central tendency effect when sensory noise is lower. In combination with the strong effect of duration on confidence, these results suggest that sensory noise causally influences both confidence and central tendency effects, confirming Prediction 3 and Prediction 4. Figure 2d visualizes these patterns by showing that the central tendency effect is considerably more pronounced under shorter stimulus duration, i.e., the implied response curve is flatter.

Sensory noise and the magnitude-dependent central tendency effect

We find strong support for Prediction 5. First recall from Fig. 2b that confidence decreases in the stimulus magnitude, indicating larger sensory noise. Our model therefore predicts proportionally larger compression (lower λ) at higher magnitudes. The results of Model 3 show that the empirical analog of λ indeed decreases in stimulus magnitude [t(69087) = −17.17, p < 0.0001], meaning that participants display proportionally stronger central tendency effects at higher stimulus magnitudes. We can also intuitively gauge this effect in Fig. 2a, where estimates are further away from the 45-degree line at higher magnitudes.

Sensory noise and response variability

Figures 2e and f visualize the results on the relationship between sensory noise and response variability. Note that Prediction 6 implies a hump-shaped relationship between response variability and confidence: response variability should increase with confidence at low confidence levels (sensory noise variance high relative to prior uncertainty) but decrease with confidence at high confidence levels (sensory noise variance relatively low). Figure 2e strongly supports this distinctive prediction of the Bayesian model. A plausible assumption is that prior uncertainty usually exceeds sensory noise variance. Correspondingly, we find that response variability decreases across the upper half of the confidence range, where sensory noise variance is low relative to prior uncertainty (Fig. 2e), and results from our regression Model 4 show a negative effect of confidence [t(11979) = −6.93, p < 0.0001]. Turning to the causal manipulation, we see from Model 5 results that variability is higher under shorter stimulus duration [t(18181) = −5.36, p < 0.0001; also see Fig. 2f]. Thus the data clearly support the model predictions about the relationship between variability, sensory noise, and subjective confidence.

Discussion

Taken together, the results establish support for all six predictions spelled out above. In particular, (i) lower sensory noise (higher confidence) is associated with a higher sensitivity of estimates to the true stimulus; (ii) lower sensory noise (higher confidence) is associated with a lower sensitivity of estimates to the average stimulus; (iii) these results hold both in correlational analyses and when we exogenously manipulate confidence; (iv) central tendency increases with the stimulus magnitude; and (v) response variability decreases with confidence and increases under shorter stimulus duration.

Re-analysis of earlier studies

While our experiment has the advantage of being specifically tailored to investigate the predictions associated with the core elements of a Bayesian model, we sought to test the validity of our hypothesis more generally. Owing to the recent publication of the Confidence Database (Rahnev et al., 2020), we were able to additionally address the relationship between confidence and the central tendency effect by re-analyzing data from several earlier studies.

Materials and methods

Data sets

Out of a total of 145 studies contained in the Confidence Database (Rahnev et al., 2020), we identified six studies that elicited continuous reports of both stimulus magnitude and subjective confidence (Table 2). Two of these studies use circular stimuli (motion direction and grating orientation), and hence are not relevant for analyses of magnitude-dependent noise. Nonetheless, these studies are still informative about the interplay between objective magnitude and confidence in determining subjective judgments.

Table 2 Descriptive information about the included studies

Data analysis

To measure the central tendency effect, we specified the average stimulus magnitude as the cumulative mean of all across-block stimulus magnitudes that preceded the current trial. We excluded each participant’s first trial in regressions involving average stimulus, because first trials do not have preceding trials. In our analysis, we excluded estimates that were clearly out of range and trials on which participants did not comply with the experiment rules (Footnote 6).
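The running average described above can be sketched as follows. The function name and the use of `None` to mark the excluded first trial are illustrative assumptions.

```python
def preceding_cumulative_mean(stimuli):
    """For each trial, return the mean of all stimuli shown before it.

    The first trial has no preceding stimuli, so its entry is None.
    """
    means = []
    running_sum = 0.0
    for index, magnitude in enumerate(stimuli):
        means.append(running_sum / index if index > 0 else None)
        running_sum += magnitude
    return means


# Hypothetical sequence of stimulus magnitudes for one participant.
ave_stim = preceding_cumulative_mean([10, 20, 60, 30])
```

For this sequence the function returns `[None, 10.0, 15.0, 30.0]`; dropping the `None` entry corresponds to excluding each participant's first trial.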

To assess the relationship between confidence and central tendency, we constructed linear mixed-effects models akin to the ones used to analyze our own experiment, except that here we do not have treatment conditions that exogenously vary sensory noise and confidence through stimulus duration.

Results and discussion

Subjective estimates correlate more strongly with objective stimulus magnitude under high confidence

Consistent with the model and our own experimental results, we document significant positive interactions between stimulus and confidence in all data sets (Table 3): AB17 [t(15254) = 4.66, p < 0.0001], DB18 [t(14432) = 3.78, p < 0.001], DB19 [t(8563) = 3.58, p < 0.001], DB20 [t(9651) = 6.16, p < 0.0001], RZ14 [t(8974) = 22.84, p < 0.0001], and SP17 [t(14059) = 11.57, p < 0.0001]. In other words, subjective estimates correlate more strongly with the objective stimulus magnitude when confidence is high, as predicted by Bayesian models of perception.

Table 3 Regression coefficients and standard errors for the re-analysis of stimulus estimates (Model 2)

In contrast to the consistent interaction between objective stimulus magnitude and confidence, the interaction between average stimulus magnitude and confidence is not consistently observed across studies (Table 3). We find a significant effect in only one of the studies [AB17, t(15254) = −3.88, p < 0.001], which has the predicted sign. We conjecture that this is due to insufficient variation in the average stimulus magnitude in prior experiments. In our new study reported above, we remedied this limitation by exploring a wider range of average stimulus magnitudes to be sufficiently statistically powered.

The low variation in stimulus magnitude in the earlier studies also makes our analysis of Prediction 5 (Model 3) difficult. We hypothesized that the magnitude of central tendency increases in stimulus magnitude, which should produce a negative coefficient for stimulus magnitude. The results of Model 3 are mixed, with a negative coefficient for only two studies, and neither of them is significant [AB17, t(12454) = −0.77, p = 0.44; DB19, t(8564) = −0.31, p = 0.75]. Again, this null result is to be expected given insufficient variation in (average) stimulus magnitude. Our new experiment deliberately remedies this shortcoming.

Response variability decreases with confidence

Finally, we again find strong evidence for confidence-dependent variability. Five studies show a significantly negative coefficient: AB17 [t(150) = −2.07, p < 0.05], DB18 [t(86) = −3.96, p < 0.001], DB19 [t(466) = −5.62, p < 0.0001], RZ14 [t(2538) = −10.58, p < 0.0001], and SP17 [t(4169) = −4.60, p < 0.0001]. These results are very similar to the ones observed in our own experimental data.

General discussion

Using data from earlier studies and a new data set, we have established a relationship between confidence and central tendency that conforms with (but is not necessarily unique to) the generic predictions of Bayesian models. First, we showed that the central tendency effect is lower on high confidence trials. Second, we showed that when sensory noise was exogenously increased via a stimulus duration manipulation, the central tendency effect increased and confidence decreased, demonstrating the causal role of sensory noise. Third, we showed a stronger central tendency effect at higher magnitudes, which is in line with subjective confidence decreasing in stimulus magnitude. Fourth, we showed that across-trial variability in responses decreased in subjective confidence and increased in sensory noise whenever prior uncertainty is relatively large.

Our findings bridge several disparate lines of research on confidence and central tendency effects. Some theories assert that confidence judgments in perceptual decision making tasks reflect the posterior probability of being correct—the Bayesian confidence hypothesis (Meyniel et al., 2015; Pouget et al., 2016; Sanders et al., 2016; Fleming & Daw, 2017; Rahnev et al., 2015). While past experimental work on the Bayesian confidence hypothesis has focused on discrete choice tasks (Aitchison et al., 2015), here we analyzed continuous report tasks, which allowed us to relate confidence judgments to the central tendency effect. Our finding that this relationship held across several different stimulus domains (time, visual numerosity, auditory numerosity, line length, motion direction, and grating orientation) lends support to the generality of our conclusions.

While our findings are specifically consistent with the Bayesian confidence hypothesis, they might also be compatible with alternative models. For example, Adler and Ma (2018) developed several models that map probability representations of uncertainty onto confidence in a non-Bayesian way, and presented experimental evidence that some of these models outperformed the Bayesian model in predicting confidence judgments. Li and Ma (2020) developed a different non-Bayesian model, which determined confidence based on the difference in probability between the top two hypotheses. For our purposes, all of these non-Bayesian models share the key property that confidence is lower when uncertainty (due to sensory noise) is greater.

Some authors have argued that previous data supporting Bayesian models can be explained by simpler heuristic models. For example, Huttenlocher, Hedges, and Vevea (2000) tested a Bayesian model of perceptual judgment similar to the one analyzed here, but their conclusions were questioned by later work showing that the same patterns of behavior could be fit by a model that simply reports an average of recent stimulus magnitudes (Duffy & Smith, 2020). This alternative model has limited explanatory scope for the data we discuss here, because it is silent about the role of confidence in generating judgments and about the effects of stimulus duration. Similarly, it does not explain the stronger central tendency effect at higher stimulus magnitudes or the predictable heterogeneity in response variability.

A second line of research bridged by our theory concerns the central tendency effect. Recent research has shown that cognitive load (which ostensibly increases sensory noise) strengthens the central tendency effect, broadly consistent with Bayesian models of perception. For example, Allred, Crawford, Duffy, and Smith (2016) found that asking participants to memorize six-digit numbers (high load condition) increased the central tendency effect in the estimation of line length, compared to a low load condition in which participants memorized two-digit numbers. Similarly, Olkkonen, McCarthy, and Allred (2014) found that increasing chromatic noise or the delay between stimulus presentation and estimation increased the central tendency effect in a color estimation task. Relatedly, there is evidence that the central tendency effect is stronger when sensory information is less reliable or when the magnitude distribution is more concentrated around the center (Ashourian & Loewenstein, 2011; Allred, Crawford, Duffy, & Smith, 2016; Olkkonen, McCarthy, & Allred, 2014; Huttenlocher, Hedges, & Vevea, 2000), again consistent with Bayesian models. Our study took this line of research one step further, showing that confidence judgments respond to an exogenous manipulation of sensory noise, while explaining significant additional response variance not explained by noise alone.

Our study is limited in several ways that can be addressed by future research. One is that we did not provide a detailed information processing model of behavior. This was a deliberate choice: our aim was to test generic principles rather than the predictions of specific information processing models, which would inevitably require a number of ad hoc assumptions. A second limitation is that we did not explore the effects of magnitude distribution shape on the central tendency effect.

The elementary psychophysical regularities studied here may have broader implications. Economists have begun to consider the significance of noisy cognition for a range of phenomena, including sensitivity to risk and ambiguity, belief updating, and survey responses (Enke & Graeber, 2020; Payzan-LeNestour & Woodford, 2020; Frydman & Jin, 2019; Woodford, 2019; Gabaix, 2019). Models of noisy cognition may therefore hold promise in unifying empirical evidence from disparate fields, explaining not only lower-level perceptual processes but also higher-level cognition.