Randomness is inherently imprecise

https://doi.org/10.1016/j.ijar.2021.06.018

Abstract

We use the martingale-theoretic approach of game-theoretic probability to incorporate imprecision into the study of randomness. In particular, we define several notions of randomness associated with interval, rather than precise, forecasting systems, and study their properties. The richer mathematical structure that thus arises lets us, amongst other things, better understand and place existing results for the precise limit. When we focus on constant interval forecasts, we find that every sequence of binary outcomes has an associated filter of intervals it is random for. It may happen that none of these intervals is precise—a single real number—which justifies the title of this paper. We illustrate this by showing that randomness associated with non-stationary precise forecasting systems can be captured by a constant interval forecast, which must then be less precise: a gain in model simplicity is thus paid for by a loss in precision. But imprecise randomness can't always be explained away as a result of oversimplification: we show that there are sequences that are random for a constant interval forecast, but never random for any computable (more) precise forecasting system. We also show that the set of sequences that are random for a non-vacuous interval forecasting system is meagre, as it is for precise forecasting systems.

Introduction

This paper documents the first steps in our attempt to incorporate imprecision into the study of algorithmic randomness. What this means is that we want to allow for, give a precise mathematical meaning to, and study the mathematical consequences of, associating randomness with interval rather than precise probabilities and expectations. We will see that this is a non-trivial problem, argue that it leads to surprising conclusions about the nature of randomness, and discover that it opens up interesting and hitherto uncharted territory for mathematical and even philosophical investigation. We believe that our work provides (the beginnings of) a satisfactory answer to questions raised by a number of researchers [19], [20], [22], [61] about frequentist and ‘objective’ aspects of interval, or imprecise, probabilities.

To explain what it is we're after, consider an infinite sequence $\omega=(z_1,\ldots,z_n,\ldots)$, whose components $z_k$ are either zero or one, and are typically considered as successive outcomes of some experiment. When do we call such a sequence random? There are many notions of randomness, and many of them have a number of equivalent definitions [1], [4]. We will focus here essentially on Martin-Löf randomness, computable randomness, and Schnorr randomness.

The randomness of a sequence ω is typically associated with a probability measure on the sample space of all such infinite sequences, or—which is essentially equivalent due to Ionescu Tulcea's extension theorem [5, Theorem II.9.2]—with a so-called forecasting system φ that associates with each finite sequence of outcomes $(x_1,\ldots,x_n)$ the (conditional) expectation $\varphi(x_1,\ldots,x_n)=E(X_{n+1}\mid x_1,\ldots,x_n)$ for the next, as yet unknown, outcome $X_{n+1}$. This $\varphi(x_1,\ldots,x_n)$ is the (precise) forecast for the value of $X_{n+1}$ after observing the values $x_1,\ldots,x_n$ of the respective variables $X_1,\ldots,X_n$. The sequence ω is then typically called ‘random’ when it passes some countable number of randomness tests, where the collection of such randomness tests depends on the forecasting system φ.
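
As a purely illustrative aside (none of the code in this rewrite appears in the paper), here is a minimal sketch of what a precise forecasting system looks like computationally: a map that sends every finite binary history to a forecast in [0,1] for the next outcome. The Laplace-style rule is a hypothetical example of such a map.

    # A precise forecasting system, viewed computationally: a map from a finite
    # binary history (x_1, ..., x_n) to a forecast in [0, 1] for the next outcome.
    # The Laplace-style rule below is a hypothetical example, not the paper's.

    def forecast(history):
        """Forecast for the next outcome, given the observed binary history."""
        return (sum(history) + 1) / (len(history) + 2)

    print(forecast(()))         # 0.5: no observations yet
    print(forecast((1, 1, 0)))  # 0.6: two ones out of three observations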

An alternative and essentially equivalent approach to defining randomness, going back to Ville [54], sees each forecast $\varphi(x_1,\ldots,x_n)$ as a fair price for—and therefore a commitment to bet on—the as yet unknown next outcome $X_{n+1}$ after observing the first n outcomes $x_1,\ldots,x_n$. The sequence ω is then ‘random’ when there is no ‘allowable’ strategy for getting infinitely rich by exploiting the bets made available by the forecasting system φ along the sequence, without borrowing. Betting strategies that are made available by the forecasting system φ are called supermartingales. Which supermartingales are considered ‘allowable’ differs in various approaches [1], [4], [18], [28], [39], but typically involves some (semi)computability requirement—we discuss relevant aspects of computability in Section 4. Technically speaking, randomness then requires that all allowable non-negative supermartingales (that start with unit value) should remain bounded on ω.
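
To make this betting interpretation concrete, the following sketch (our own illustration, with a hypothetical proportional betting strategy) tracks Sceptic's capital along a finite sequence when each precise forecast is treated as a fair price: at every step Sceptic stakes an amount on the next outcome and gains the stake times (outcome minus forecast). The capital starts at one and, because the stake is a fraction of the current capital, never goes negative, mirroring the no-borrowing requirement; on a random sequence, no allowable strategy of this kind should make the capital unbounded.

    # Illustrative only: Sceptic's capital process along a finite sequence of
    # outcomes, when each precise forecast p is treated as a fair price for the
    # next outcome. The proportional strategy below is a hypothetical example.

    def capital_process(outcomes, forecast, bet_fraction=0.1):
        """Start with unit capital, never borrow, gain stake * (outcome - forecast)."""
        capital, history, path = 1.0, [], [1.0]
        for x in outcomes:
            p = forecast(tuple(history))
            stake = bet_fraction * capital   # always a fraction of current capital
            capital += stake * (x - p)       # fair-price gamble on the next outcome
            path.append(capital)
            history.append(x)
        return path

    print(capital_process([1, 0, 1, 1, 0, 1], lambda history: 0.5))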

It is this last, martingale-theoretic, approach that seems to lend itself most easily to allowing for interval rather than precise forecasts, and therefore to allowing for ‘imprecision’ in the definition of randomness. As we explain in Sections 2 and 3, an interval, or ‘imprecise’, forecasting system φ associates with each finite sequence of outcomes $(x_1,\ldots,x_n)$ a (conditional) expectation interval $\varphi(x_1,\ldots,x_n)$ for the next, as yet unknown, outcome $X_{n+1}$. The lower bound of this interval forecast represents a supremum acceptable buying price, and its upper bound an infimum acceptable selling price, for the next outcome $X_{n+1}$. This idea rests firmly on the common ground between Walley's [60] theory of coherent lower previsions and Shafer and Vovk's [45], [46] game-theoretic approach to probability that we have helped establish in recent years, through our research on imprecise stochastic processes [13], [16]; see also Refs. [2], [53] for more details on so-called ‘imprecise probabilities’. These theoretical developments allow us here to associate supermartingales with an interval forecasting system, and therefore in Section 5 to extend a number of existing notions of randomness to allow for interval, rather than precise, forecasts: we include in particular Martin-Löf randomness and computable randomness [1], [4], [18], [39]. In Section 6, we also extend Schnorr randomness [1], [4], [18], [39] to allow for interval forecasts. We then show in Section 7 that our approach allows us to extend to interval forecasting some of Dawid's [7] well-known work on calibration, and to establish a number of interesting ‘limiting frequencies’ or computable stochasticity results.
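
Concretely (this is our gloss, in the standard language of the imprecise-probability literature cited above), an interval forecast $I=[\underline{p},\overline{p}]$ for a binary outcome can be identified with a pair of lower and upper expectation operators on gambles $f$:
\[
\underline{E}_I(f)\;=\;\min_{p\in I}\bigl(p\,f(1)+(1-p)\,f(0)\bigr),
\qquad
\overline{E}_I(f)\;=\;\max_{p\in I}\bigl(p\,f(1)+(1-p)\,f(0)\bigr),
\]
so that in particular $\underline{E}_I(X_{n+1})=\underline{p}$ is the supremum acceptable buying price and $\overline{E}_I(X_{n+1})=\overline{p}$ the infimum acceptable selling price for $X_{n+1}$.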

We believe the discussion becomes especially interesting in Section 8, where we start restricting our attention to constant, or stationary, interval forecasts. We see this as an extension of the more classical accounts of randomness, which typically consider a forecasting system with constant forecast 1/2—corresponding to flipping a fair coin. As we have by now come to expect from our experience with so-called imprecise probability models, when we allow for interval forecasts, a mathematical structure appears that is much more interesting than the rather simpler case of precise forecasts would lead us to suspect. In the precise case, a given sequence may not be random for any stationary forecast, but as we will see, in the case of interval forecasting there typically is a filter of intervals that a given sequence is random for. Furthermore, as we show in Section 9 by means of explicit examples, this filter may not have a smallest element, and even when it does, this smallest element may be a non-vanishing interval: this is the first cornerstone for our argument that randomness is inherently imprecise.

The examples in Section 9 all involve sequences that are random for some computable non-stationary precise forecast, but can't be random for a stationary forecast unless it becomes interval-valued, or imprecise. This might lead to the suspicion that this imprecision is perhaps only an artefact, which results from looking at non-stationary phenomena through an imperfect stationary lens. We show in Section 10 that this suspicion is unfounded: there are sequences that are random for a stationary interval forecast, but that aren't random for any computable (more) precise forecast, be it stationary or not. This further corroborates our claim that randomness is, indeed, inherently imprecise.

Finally, in Section 11, we argue that ‘imprecise’ randomness is an interesting extension of the existing notions of ‘precise’ randomness, because it is equally rare: just as for precise stationary forecasts, the set of all sequences that are random for a non-vacuous stationary interval forecast is meagre. This, we will argue, indicates that the essential distinction lies not between precise and imprecise forecasts (or randomness), but between non-vacuous and vacuous ones, and provides further evidence for the essentially ‘imprecise’ nature of the randomness notion.

We conclude with a short discussion of the significance of our findings, and of possible avenues for further research. In order to maintain focus, we have decided to move all technical proofs of auxiliary results about computability and growth functions to an appendix. We have also, as much as possible, tried to make sure that our more complicated and technical proofs in the main text are preceded by informal arguments, in order to help the reader build some intuition about why and how they work.

Section snippets

A single interval forecast

The dynamics of making a single forecast can be made very clear, after the fashion first introduced by Shafer and Vovk [45], [46], by considering a simple game, with three players, namely Forecaster, Sceptic and Reality. The game involves an initially unknown outcome in the set {0,1}, which we will denote by X. To stress that it is unknown, we call it a variable, and use upper-case notation.

Game

Single forecast of an outcome X

In a first step, the first player, Forecaster, specifies an interval bound $I=[\underline{p},\overline{p}]\subseteq[0,1]$ for the

Interval forecasting systems and imprecise probability trees

We now consider a sequence of repeated versions of the forecasting game in the previous section. At each successive stage $k\in\mathbb{N}$, Forecaster presents an interval forecast $I_k=[\underline{p}_k,\overline{p}_k]$ for the unknown outcome variable $X_k$. This effectively allows Sceptic to choose any gamble $f_k(X_k)$ such that $\overline{E}_{I_k}(f_k)\le 0$, where $\overline{E}_{I_k}$ denotes the upper expectation associated with the interval forecast $I_k$. Reality then chooses a value $x_k$ for $X_k$, resulting in a gain in capital $f_k(x_k)$ for Sceptic. This gain $f_k(x_k)$ can, of course, be negative, resulting in an actual decrease in Sceptic's capital.
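
The sketch below (our own illustration under the reading just given, with hypothetical function names) spells out one round of this repeated game: a gamble $f_k$, represented by its two payoffs $(f_k(0),f_k(1))$, is available to Sceptic exactly when its upper expectation over the interval forecast is at most zero, and Sceptic's capital then changes by $f_k(x_k)$.

    # Illustrative sketch of a single round of the repeated forecasting game.
    # A gamble f, given by its payoffs (f0, f1), is available to Sceptic when its
    # upper expectation over the interval forecast [p_lo, p_hi] is at most zero.

    def upper_expectation(f0, f1, p_lo, p_hi):
        """Maximum of p*f1 + (1-p)*f0 over p in [p_lo, p_hi] (linear in p)."""
        return max(p * f1 + (1 - p) * f0 for p in (p_lo, p_hi))

    def play_round(capital, f0, f1, p_lo, p_hi, outcome):
        """Update Sceptic's capital with an allowed gamble, after Reality's move."""
        if upper_expectation(f0, f1, p_lo, p_hi) > 1e-12:
            raise ValueError("gamble not made available by the interval forecast")
        return capital + (f1 if outcome == 1 else f0)

    # Forecast [0.4, 0.6]; buying the outcome at the upper price 0.6 for a unit
    # stake corresponds to the gamble f(X) = X - 0.6, i.e. (f0, f1) = (-0.6, 0.4).
    print(play_round(1.0, f0=-0.6, f1=0.4, p_lo=0.4, p_hi=0.6, outcome=1))  # 1.4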

Basic computability results

We now give a brief survey of a number of basic notions and results from computability theory, and a few derived results, that are relevant to the developments in this paper. For a much more extensive discussion, we refer, for instance, to Refs. [29], [35].

Random sequences in an imprecise probability tree

With all the scaffolding now in place, we're finally ready to associate various notions of randomness with a forecasting system φ—or in other words, with an imprecise probability tree. We want to be able to introduce and study several versions of randomness, each connected with a particular class of test supermartingales—capital processes for Sceptic when she starts with unit capital and never borrows.

Schnorr randomness in an imprecise probability tree

Next, we concentrate on extending the notion of Schnorr randomness to our present context. We begin with a definition borrowed from Schnorr's seminal work [39], [40].

Definition 3 (Growth function)

We call a map $\rho\colon\mathbb{N}_0\to\mathbb{N}_0$ a growth function if

  • (i) it is recursive;

  • (ii) it is non-decreasing: $(\forall n_1,n_2\in\mathbb{N}_0)\,(n_1\le n_2\Rightarrow\rho(n_1)\le\rho(n_2))$;

  • (iii) it is unbounded.

We say that a real-valued map $\mu\colon\mathbb{N}_0\to\mathbb{R}$ is computably unbounded if there is some growth function $\rho$ such that $\limsup_{n\to\infty}[\mu(n)-\rho(n)]>0$, or equivalently, $\inf_{m\in\mathbb{N}_0}\sup_{n\ge m}[\mu(n)-\rho(n)]>0$.
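
As a purely illustrative, finite-horizon sketch of ours (an actual verification of computable unboundedness would of course have to consider all of $\mathbb{N}_0$), the following code pairs an example growth function with an example map and checks the reconstructed condition up to a fixed horizon.

    import math

    # An example growth function: recursive, non-decreasing and unbounded.
    def rho(n):
        return math.isqrt(n)  # floor of the square root of n

    # Finite-horizon stand-in for inf_m sup_{n >= m} [mu(n) - rho(n)] > 0:
    # for a few starting indices m, check that mu exceeds rho somewhere beyond m.
    def seems_computably_unbounded(mu, rho, horizon=10_000, starts=(0, 100, 1000)):
        return all(
            max(mu(n) - rho(n) for n in range(m, horizon)) > 0
            for m in starts
        )

    print(seems_computably_unbounded(lambda n: 0.25 * n, rho))  # True for this pair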

In what follows, it will

Consistency results

We now turn to a number of important consistency results for the various randomness notions we have introduced. In the rest of this section, unless explicitly mentioned to the contrary, A is an arbitrary but fixed set of allowable test processes.

Constant interval forecasts

From now on, we turn to the special case where the interval forecasts $I\in\mathcal{I}$ are constant, and don't depend on the already observed outcomes (here $\mathcal{I}$ denotes the set of all closed subintervals of $[0,1]$). This leads to a generalisation of the classical case $I=\{1/2\}$ of the randomness associated with a fair coin.

In the rest of this section, unless explicitly stated to the contrary, A is an arbitrary but fixed set of allowable test processes. For any interval $I\in\mathcal{I}$, we denote by $\gamma_I\colon S\to\mathcal{I}$ the corresponding so-called stationary forecasting system that assigns the same

Imprecise randomness due to non-stationarity

Our work on imprecise Markov chains [12], [14], [16], [25], [51] has taught us that in some cases, we can very efficiently compute tight bounds on expectations in non-stationary precise Markov chains, by replacing them with their stationary imprecise versions. Similarly, in statistical modelling, when learning from data sampled from a distribution with a varying (non-stationary) parameter, it seems hard to estimate the time sequence of its values, but we may be more successful in learning about
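
To convey the intuition behind this trade-off, here is a purely illustrative simulation of ours (not an example from the paper, and only about relative frequencies rather than the full martingale-theoretic notions): the success probability of a non-stationary precise model switches between 0.3 and 0.7 on exponentially growing blocks, and the running relative frequency then keeps oscillating instead of settling on a single number, while remaining inside the constant interval [0.3, 0.7]. A stationary description of such a sequence is available, but only at the price of an interval forecast.

    import random

    random.seed(1)

    # Non-stationary precise model: the success probability switches between
    # 0.3 and 0.7 on exponentially growing blocks of indices (illustrative only).
    def p(n):
        return 0.7 if n.bit_length() % 2 == 1 else 0.3

    ones, running = 0, []
    for n in range(1, 2 ** 18 + 1):
        ones += random.random() < p(n)
        if n & (n - 1) == 0:              # record running frequency at powers of two
            running.append(round(ones / n, 3))
    # later entries keep oscillating (roughly between 0.43 and 0.57), never converge
    print(running)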

Imprecision can't be explained away

The examples in the previous section illustrate that randomness associated with a non-stationary precise forecasting system can also be ‘described’ as randomness for a simpler, stationary but then necessarily imprecise, forecasting system. This observation might lead to the suspicion that all stationary imprecise forms of randomness can be ‘explained away’ as such simpler representations of non-stationary but precise forms of randomness. This would imply that the imprecision—or loss of

The meagreness of random sequences

In yet another beautiful paper we came across while researching this topic, Muchnik, Semenov and Uspensky [31] showed that the set of all paths that are random for a precise stationary forecast is meagre.

The essence of their argument is the following. They call a path ω lawful if there is some algorithm that, given as input any situation s on the path ω, outputs a non-trivial finite set $R(s)$ of situations $t\sqsupset s$ that strictly follow that situation s, such that one of these ‘extensions’ t is also on
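
To give a feel for this notion, the sketch below (our own, with hypothetical names, and covering only the part of the definition quoted here) takes an algorithm R that maps a situation to a finite set of strict extensions, and checks along a finite prefix of ω whether, at every situation on the path, at least one of the predicted extensions lies on the path again.

    # Illustrative check of the quoted 'lawfulness' condition along a finite prefix
    # of a path. R maps a situation (a tuple of bits) to a finite set of situations
    # strictly extending it; both R and the example paths are hypothetical.

    def extensions_hit_path(R, path):
        """True if, for every situation s on the path, some t in R(s) is also on it."""
        for n in range(len(path)):
            s = tuple(path[:n])
            if not any(tuple(path[:len(t)]) == t for t in R(s) if len(t) <= len(path)):
                return False
        return True

    def R(s):
        return {s + (len(s) % 2,)}  # predict the next bit from the current depth

    print(extensions_hit_path(R, (0, 1, 0, 1, 0, 1)))  # True: alternating prefix
    print(extensions_hit_path(R, (0, 1, 1)))           # False: prediction missed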

Conclusion

The probability of an event is often seen as a precise, or at least ideally precise, number. Apart from a few notable exceptions in earlier accounts [17], [24], [43], [44], a more determined investigation into reasons for letting go of this idealisation, and into mathematical ways to achieve this, only started in the later decades of the 20th century [26], [27], [41], [42], [45], [60]; see also Refs. [2], [53] for overviews. Most of this work centred on the decision-theoretic and epistemic

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This paper has taken a very long time to write. Our research on this topic started with discussions between Gert and Philip Dawid about what prequential interval forecasting would look like, during a joint stay at Durham University in late 2014. Gert, and Jasper who joined in late 2015, wrote an early prequential version of the present paper during a joint research visit to the University of Strathclyde and Durham University in May 2016, trying to extend the results in Refs. [57], [58], [59] to

References (62)

  • Klaus Ambos-Spies et al., Randomness in computability theory, Contemp. Math. (2000)
  • Gordon Belot, Failure of calibration is typical, Stat. Probab. Lett. (2013)
  • Laurent Bienvenu et al., On the history of martingales in the study of randomness, J. Électron. Hist. Probab. Stat. (2009)
  • P. Billingsley, Probability and Measure (1995)
  • Alonzo Church, On the concept of a random sequence, Bull. Am. Math. Soc. (1940)
  • A. Philip Dawid, The well-calibrated Bayesian, J. Am. Stat. Assoc. (1982)
  • A. Philip Dawid, Statistical theory: the prequential approach, J. R. Stat. Soc. A (1984)
  • A. Philip Dawid, Calibration-based empirical probability, Ann. Stat. (1985)
  • A. Philip Dawid, Self-calibrating priors do not exist: comment, J. Am. Stat. Assoc. (1985)
  • A. Philip Dawid et al., Prequential probability: principles and properties, Bernoulli (1999)
  • Jasper De Bock et al., Sum-product laws and efficient algorithms for imprecise Markov chains
  • Gert de Cooman et al., Imprecise probability trees: bridging two theories of imprecise probability, Artif. Intell. (2008)
  • Gert de Cooman et al., Imprecise Markov chains and their limit behaviour, Probab. Eng. Inf. Sci. (2009)
  • Gert de Cooman et al., A pointwise ergodic theorem for imprecise Markov chains
  • Gert de Cooman et al., Imprecise stochastic processes in discrete time: global models, imprecise Markov chains, and ergodic theorems, Int. J. Approx. Reason. (2016)
  • A.P. Dempster, Upper and lower probabilities induced by a multivalued mapping, Ann. Math. Stat. (1967)
  • Rodney G. Downey et al., Algorithmic Randomness and Complexity (2010)
  • Pablo I. Fierens, An extension of chaotic probability models to real-valued variables, Int. J. Approx. Reason. (2009)
  • Pablo I. Fierens et al., A frequentist understanding of sets of measures, J. Stat. Plan. Inference (2009)
  • Terrence L. Fine, On the apparent convergence of relative frequency and its implications, IEEE Trans. Inf. Theory (1970)
  • Igor I. Gorban, The Statistical Stability Phenomenon (2016)
  • Andrei N. Kolmogorov, Three approaches to the quantitative definition of information, Probl. Inf. Transm. (1965)
  • Bernard O. Koopman, The axioms and algebra of intuitive probability, Ann. Math. (2) (1940)
  • Thomas Krak et al., Imprecise continuous-time Markov chains, Int. J. Approx. Reason. (2017)
  • Henry E. Kyburg, Higher order probabilities and intervals, Int. J. Approx. Reason. (1988)
  • Isaac Levi, The Enterprise of Knowledge (1980)
  • Leonid A. Levin, On the notion of a random sequence, Sov. Math. Dokl. (1973)
  • Ming Li et al., An Introduction to Kolmogorov Complexity and Its Applications (1993)
  • Per Martin-Löf, The definition of random sequences, Inf. Control (1966)
  • Andrei A. Muchnik et al., Mathematical metaphysics of randomness, Theor. Comput. Sci. (1998)