Probability Axioms and Set Theory Paradoxes

Herman, Ari; Caughman, John

doi:10.3390/sym13020179

by Ari Herman * and John Caughman

Fariborz Maseeh Department of Mathematics and Statistics, Portland State University, Portland, OR 97201, USA

* Author to whom correspondence should be addressed.

Symmetry 2021, 13(2), 179; https://doi.org/10.3390/sym13020179

Submission received: 15 December 2020 / Revised: 9 January 2021 / Accepted: 20 January 2021 / Published: 22 January 2021

(This article belongs to the Section Mathematics)

Abstract:

In this paper, we show that Zermelo–Fraenkel set theory with Choice (ZFC) conflicts with basic intuitions about randomness. Our background assumptions are the Zermelo–Fraenkel axioms without Choice (ZF) together with a fragment of Kolmogorov’s probability theory. Using these minimal assumptions, we prove that a weak form of Choice contradicts two common-sense assumptions about probability, both based on simple notions of symmetry and independence.

Keywords:

set theory; probability; axiom of choice

1. A Puzzle

We begin with a paradox involving the Axiom of Choice (AC) and an infinite set of fair coins. An early version of this result first appeared as a problem in the American Mathematical Monthly [1]; a version closer to ours can be found in [2].

Let

I = 2^{ω}

denote the set of all binary-valued functions on

ω = {0, 1, 2, \dots}

. Let

ψ : I \to I

be a randomly constructed function. By this, we mean that for each

r \in I

,

n \in ω

,

ψ (r) (n)

is determined by a fair coin toss. (If you prefer, you may also interpret each element of I as the binary expansion of a real number

0 \leq r \leq 1

. We note that each dyadic rational,

0 < r < 1

, can occur in two ways; e.g.,

\frac{1}{2} = 0.1 \bar{0} = 0.0 \bar{1}

).) We now ask: “If

\hat{r} \in I

is chosen at random (i.e.,

\hat{r} (n)

is determined by a fair coin toss for each

n \in ω

), is it possible to guess the value of

ψ (\hat{r})

given the values

ψ (r)

for all

r \neq \hat{r}

?”. The intuitively obvious answer is “no”. Since each value of

ψ

was chosen independently, the restriction of

ψ

to

I \ {\hat{r}}

should carry no information about

ψ (\hat{r})

. Hence, if we are limited to information about

ψ ↾ (I \ {\hat{r}})

, the odds of guessing

ψ (\hat{r})

correctly should be 0, no matter what strategy we employ.

But are they? Consider the following argument, which can be fully formalized within Zermelo–Fraenkel set theory with Choice (ZFC). Define an equivalence relation on

I^{I}

by setting

f \sim g

if and only if

f (r) = g (r)

for all but finitely many

r \in I

. By AC, there exists an

S \subseteq I^{I}

that intersects each ∼-class in one point. For each

g \in I^{I}

, let

g^{🟉}

be the unique function in

S \cap [g]

, and let

Δ_{g}

denote the finite set

{r \in I : g (r) \neq g^{🟉} (r)}

. Note that

g^{🟉}

is uniquely determined by the restriction of g to any cofinite subset of I. Hence,

ψ^{🟉}

is determined by

ψ ↾ (I \ {\hat{r}})

, so we can employ the strategy of guessing that

ψ (\hat{r}) = ψ^{🟉} (\hat{r})

. This strategy fails if and only if

\hat{r} \in Δ_{ψ}

. Since

Δ_{ψ}

is a finite set depending only on

ψ

(not on

\hat{r}

), any randomly chosen

\hat{r} \in I

is almost certain not to lie in

Δ_{ψ}

. Thus, using only information about

ψ ↾ (I \ {\hat{r}})

, our strategy is almost certain to guess the correct value of

ψ (\hat{r})

. This paradox invites us to reexamine the assumptions underlying ZFC.

2. Introduction

2.1. Axioms and Mathematical Intuition

Over the last 100 years, set theory (ZFC) has become widely accepted as a foundation for mathematics. If mathematics is to be a search for objective truth, then the correctness of these foundational axioms is essential. Axiomatic systems, such as ZFC, turn our mathematical intuitions into precise statements. Hence, their validity ultimately depends on whether these intuitions are correct. Further, mathematics will always have meaningful questions that cannot be resolved on the basis of currently accepted axioms (as shown by the MRDP theorem [3], which establishes that Hilbert’s 10th problem is unsolvable and hence that no consistent, effectively axiomatized formal system can prove all true statements of the form “

\forall \bar{x} \in ω^{m} (F (\bar{x}) \neq 0)

” for

F \in Z [\bar{X}]

). The desire to move beyond these limitations naturally motivates the invention of new axioms, which can only be justified using informal arguments. In this way, formal mathematics rests atop informal mathematics.

Of course, people’s mathematical intuitions do not always agree. A prime example of this is the Axiom of Choice, first articulated by Zermelo in 1904. Most mathematicians find this axiom to be self-evident. Indeed, prior to its codification as part of ZFC, mathematicians had already been using it implicitly in their proofs (including even strong opponents of the axiom like Lebesgue [4]). Yet, this axiom has been the subject of rich controversy due to its non-constructive nature and counter-intuitive consequences. Further, accepting (or not accepting) the Axiom of Choice has significant implications for many areas of mathematics. Hence, it shows that the question of whether to accept a foundational principle can be both subtle and important. For a standard reference on the Axiom of Choice, as well as the other ZFC axioms, see [5].

Compared with other scientists, it may be especially important for mathematicians to take an active interest in the foundations of their subject. Since many mathematical statements cannot be empirically tested, incorrect mathematics will not necessarily “self-correct” as an incorrect theory of physics might. Therefore, the validity of mathematics rests on our willingness to critique our own basic assumptions. In this paper, we study two axioms that, while highly intuitive, are incompatible with ZFC. These axioms stem from simple, intuitive arguments concerning probability, and we believe they merit the consideration of anyone interested in the foundations of mathematics.

2.2. Mathematical Background

The probability axioms explored in this article bear a close connection to Freiling’s famous Axiom of Symmetry [6], concerning functions from

[0, 1]

to countable subsets of

[0, 1]

:

Axiom 1

(cf. Axiom

A_{ℵ_{0}}

from [6]). If

f : [0, 1] \to P_{\leq ℵ_{0}} ([0, 1])

, then there exist

x, y \in [0, 1]

such that

x \notin f (y)

and

y \notin f (x)

.

Freiling motivates this axiom as follows: Suppose we choose

x, y \in [0, 1]

uniformly at random. Since

f (y)

is a countable set, it is almost certain that

x \notin f (y)

. By symmetry, it is almost certain that

y \notin f (x)

. Hence, for a randomly chosen pair,

x, y

, it is almost certain that both

x \notin f (y)

and

y \notin f (x)

; and we conclude that some pair must satisfy this condition. Incredibly, under ZFC, this axiom is equivalent to the negation of the continuum hypothesis.

We are not the first to observe that this type of reasoning also has the potential to contradict ZFC itself. If we make the (plausible) assumption that any subset of

[0, 1]

with cardinality

< 2^{ℵ_{0}}

must have probability 0, we might suggest the following stronger principle:

Axiom 2

(cf. Axiom

A_{< 2^{ℵ_{0}}}

from [6]). If

f : [0, 1] \to P_{< 2^{ℵ_{0}}} ([0, 1])

, then there exist

x, y \in [0, 1]

such that

x \notin f (y)

and

y \notin f (x)

.

However, as Freiling notes, this axiom implies that

[0, 1]

cannot be well-ordered (assuming the other ZF axioms). Ultimately, Freiling rejects Axiom 2, saying that he knows of no compelling reason to support the belief that small sets have probability 0.

A second way in which Freiling’s intuition may be used to argue against AC is explored in [7]. Van Lambalgen’s approach is to add a new “almost all” quantifier, Q, to the language of set theory. Among van Lambalgen’s axioms for this quantifier is the following, which directly formalizes Freiling’s symmetry argument:

Axiom 3

(cf. Axiom Q6 from [7]). For any formula

ϕ

,

Q x Q y ϕ \leftrightarrow Q y Q x ϕ

.

In [7], van Lambalgen shows that his axioms for Q, together with ZF, imply that

2^{ω}

is not well-orderable.

The results of this paper show further ways in which Freiling’s reasoning can conflict with ZFC. We note two important ways in which our arguments differ from those just mentioned. First, both Freiling and van Lambalgen derive a contradiction with ZFC by arguing that a set equinumerous with the continuum cannot be well-ordered. By contrast, our contradictions require only the following (even weaker) special case of AC.

Axiom 4

(Weak Axiom of Choice). Every partition of

2^{ω}

into countable sets has a set of representatives.

Second, we will reconsider the usual axioms of probability. We will argue that some of Kolmogorov’s axioms do not have a clear justification, and that this calls into question any argument against ZFC based on them. We will then present a more parsimonious framework for probability and derive our contradictions within that framework.

2.3. Freiling’s Argument for $\neg CH$

We are greatly indebted to Freiling for his innovative work on the continuum hypothesis. In [6], he proposed a negative solution to CH based on intuitive probability axioms, such as Axiom 1. We view our axioms as natural extensions of his.

The eventual aim of this paper is to show that Freiling’s reasoning (taken to its logical conclusion) is incompatible with ZFC, which he assumes in his argument. Therefore, we believe that Freiling’s approach to CH does not ultimately work. Nonetheless, his argument is delightful and we include a version of it here for context. The remainder of this section is independent of the rest of the paper and may be skipped without any loss of understanding.

Theorem 1

(Adapted from [6]). For all

n < ω

, there exists

A_{n} \subseteq ω_{n}^{n + 2}

such that:

(i): For all $\bar{x} \in ω_{n}^{n + 1}$ , $| {y \in ω_{n} : {\bar{x}}^{⌢} y \in A_{n}} | < \infty$
(ii): $⋃_{σ \in Sym (n + 2)} σ \cdot A_{n} = ω_{n}^{n + 2}$

Proof.

We proceed by induction. For

n = 0

, we may take

A_{0} = {(m, n) \in ω^{2} : n \leq m}

. Now, assume the result holds for

n - 1

. For each

α < ω_{n}

, there exists an injection

f_{α} : ω_{n - 1} \to ω_{n}

such that

α + 1 \subseteq im (f_{α})

. Let

A_{n} = {(α, f_{α} (β_{0}), \dots, f_{α} (β_{n})) : α < ω_{n} \land (β_{0}, \dots, β_{n}) \in A_{n - 1}}

. First, let

G \leq Sym (n + 2)

be the stabilizer of 0. Then it follows from the inductive hypothesis that

⋃_{σ \in G} σ \cdot A_{n} \supseteq {(α, γ_{0} \dots, γ_{n}) \in ω_{n}^{n + 2} : \forall i (α \geq γ_{i})}

. Therefore,

⋃_{σ \in Sym (n + 2)} σ \cdot A_{n} = ω_{n}^{n + 2}

as desired. □
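As a small sanity check on the base case of the induction, the following Python sketch tests conditions (i) and (ii) for A_0 on a finite truncation of ω. The names `in_A0`, `fiber`, `covered`, and the truncation `GRID` are hypothetical and chosen only for illustration. A finite check of this kind cannot replace the proof; it only confirms that the definition of A_0 behaves as claimed on small inputs.

```python
from itertools import permutations


def in_A0(point):
    """Membership in A_0 = {(m, n) in omega^2 : n <= m}."""
    m, n = point
    return n <= m


def fiber(x, search_bound):
    """All y < search_bound with (x, y) in A_0.

    Condition (i) asks that this set be finite for every x; here it is
    {0, ..., x}, independent of search_bound once search_bound > x.
    """
    return [y for y in range(search_bound) if in_A0((x, y))]


def covered(point):
    """Condition (ii): some coordinate permutation of `point` lies in A_0."""
    return any(
        in_A0(tuple(point[i] for i in sigma))
        for sigma in permutations(range(len(point)))
    )


GRID = 40  # hypothetical truncation of omega, used only for this check
fibers_ok = all(fiber(x, 10 * GRID) == list(range(x + 1)) for x in range(GRID))
cover_ok = all(covered((m, n)) for m in range(GRID) for n in range(GRID))
```

Both `fibers_ok` and `cover_ok` evaluate to `True`: each fiber over x is {0, ..., x}, and every pair lands in A_0 after possibly swapping its coordinates.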

The following is adapted from [6]:

Axiom 5.

Let

I = [0, 1]

(or

2^{ω}

). For every positive integer, n, and for any

A \subseteq I^{n}

the following cannot simultaneously be true:

(i): For all $\bar{x} \in I^{n - 1}$ , $| {y \in I : {\bar{x}}^{⌢} y \in A} | < \infty$
(ii): $⋃_{σ \in Sym (n)} σ \cdot A = I^{n}$

The justification for Axiom 5 is similar to that given for Axiom 1. Suppose we choose

(x_{0}, \dots, x_{n - 1}) \in I^{n}

by choosing each

x_{i}

uniformly at random and independently. If we choose

x_{0}, \dots, x_{n - 2}

first, then by (i), it is almost certain that

x_{n - 1}

will be such that

(x_{0}, \dots, x_{n - 1}) \notin A

. Yet, the order in which we choose the

x_{i}

should be irrelevant, so we conclude that

(x_{0}, \dots, x_{n - 1})

is almost certainly not in A, period. By symmetry, for any

σ \in Sym (n)

,

(x_{σ (0)}, \dots, x_{σ (n - 1)})

is almost certain not to lie in A. It follows that

(x_{0}, \dots, x_{n - 1})

is almost certain not to lie in

⋃_{σ \in Sym (n)} σ \cdot A

, which implies this union does not equal

I^{n}

.

Corollary 1.

ZFC + Axiom 5 implies

2^{ℵ_{0}} \geq ℵ_{ω + 1}

Proof.

Suppose there is a bijection between

2^{ω}

and

ℵ_{n}

, for some

n < ω

. This, together with Theorem 1, implies that there is a counter-example to Axiom 5 (for the

n + 2

case). Hence,

2^{ℵ_{0}} \geq ℵ_{ω}

. By König’s theorem (a standard result in set theory [5]),

2^{ℵ_{0}}

has uncountable cofinality. Hence,

2^{ℵ_{0}} \neq ℵ_{ω}

. □

2.4. Revisiting Kolmogorov’s Axioms

If we wish to use probability as a source of foundational principles, as Freiling does, we are obliged to examine the assumptions underlying probability theory itself. Modern probability was formalized in the 1930s by Kolmogorov, using the concepts of measure theory. Despite its great success, the measure-theoretic framework is not beyond scrutiny. In particular, we believe that Kolmogorov’s axioms cannot be reasonably deduced from first principles. According to Kolmogorov, a probability space is a triple

(X, Σ, P)

, where X is a set,

Σ \subseteq P (X)

is a

σ

-algebra and

P : Σ \to R

is a (

σ

-additive) measure satisfying

P (X) = 1

(for more on probability spaces, see [8]).

Kolmogorov makes two significant assumptions that we believe are not well justified. The first is that probabilities are elements of

R

, rather than some other ordered field extending

Q

, such as a hyperreal field (this is explored in [9]). The modern conception of

R

was motivated by the desire to give a rigorous foundation to analysis (which was primarily developed for physics). Thus, the applicability of real numbers to probability is by no means self-evident. In fact, there are examples of events that prima facie appear to require nonzero, infinitesimal probabilities, an impossibility in Archimedean fields such as

R

. Hence, if real numbers are well-suited to probability, this ought to be argued for, rather than taken as a hypothesis.

Secondly, we can examine Kolmogorov’s assumptions regarding the structure of measurable sets. It is a point in his favor that he does not assume that all subsets of X have defined probabilities (a reasonable precaution given results such as the Banach–Tarski theorem [10]). However, it would then be logical to call a set “measurable” precisely when its probability can be calculated using some set of basic assumptions (e.g., additivity for disjoint sets).

On the other hand, there seems to be no clear reason why measurable sets should be closed under unions and intersections. There is no general method for assigning a probability to the union or intersection of two events in terms of their individual probabilities. Hence, the assumption that measurable sets form an algebra (let alone a

σ

-algebra) needlessly constrains the universe of allowed probability spaces, and may exclude some desirable spaces.

There are clear practical advantages to having measurable sets be closed under unions and intersections. However, we have good reason not to accept axioms simply because they are plausible and convenient. In the case of

R^{3}

, it is quite reasonable to think that ordinary volume ought to be invariant under the operation of breaking a set into two pieces and applying a Euclidean transformation to one of those pieces such that it remains disjoint from the other. Hence, one may be tempted to assert that the family of measurable sets be closed under this type of operation. Yet, it is well-known that (assuming AC) this cannot hold for any measure that assigns the usual volumes to all boxes. Such examples urge us to use extreme caution when asserting closure properties for measurable sets.

3. Minimalist Probability and New Axioms

3.1. Uniform Probability on $2^{ω}$

For the reasons given above, we choose for our basic framework a formulation of probability that differs somewhat from Kolmogorov’s. In particular, we will not insist that measurable sets form an algebra, nor will we insist that probabilities be real numbers. We will then show that the conflicts between ZFC and our probability axioms arise, even within this simplified framework. (Another approach to probability, which is more philosophically cautious than Kolmogorov’s, is qualitative probability. In this formulation, a “not more likely” relation on events, ≦, is axiomatized, rather than a probability function. For one possible axiomatization, see [11]. We note that our axioms and results can be adapted to this setting.)

Our definitions and axioms are motivated by a prototypical example: selecting a random element from

2^{ω}

by flipping a fair coin for each

n \in ω

, associating 0 and 1 with heads and tails, respectively. We refer to this intuitive idea as the fair coin space.

For at least some

A \subseteq 2^{ω}

, the probability that a random element of the fair coin space lies in A seems to have an exact answer. For example, if

A = {x \in 2^{ω} : x_{0} = x_{3} = 0}

, then membership in A depends only on the outcomes of two of the coins, and thus the probability appears to be

\frac{1}{4}

.
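The value 1/4 in this example can be reproduced by direct enumeration, since only finitely many coins are constrained. The sketch below is an illustration under that assumption; the helper name `cylinder_probability` and its dictionary-based interface are hypothetical, and the dictionary of constraints is assumed to be nonempty.

```python
from fractions import Fraction
from itertools import product


def cylinder_probability(constraints):
    """Exact probability that fair coins match `constraints` (index -> bit).

    Only the finitely many constrained coins matter, so we enumerate the
    outcomes of coins 0..max_index and count those meeting every constraint.
    """
    n = max(constraints) + 1
    hits = sum(
        all(outcome[i] == bit for i, bit in constraints.items())
        for outcome in product((0, 1), repeat=n)
    )
    return Fraction(hits, 2 ** n)


# The event A = {x : x_0 = x_3 = 0} from the text.
p_A = cylinder_probability({0: 0, 3: 0})
```

Here `p_A` equals `Fraction(1, 4)`: of the 16 outcomes of coins 0 through 3, exactly 4 have coins 0 and 3 both equal to 0.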

In the remainder of this section, we present several new definitions and axioms, which are motivated by further considering the fair coin space and exploiting its symmetry (the coins are fair and identical) and independence (the coins do not influence each other).

Definition 1.

Let

(R, \leq)

be any total order extending

(Q, \leq)

. A minimalist probability space (MPS) is a pair,

Σ \subseteq P (2^{ω})

,

P : Σ \to R

, satisfying the conditions below.

(i): If $Δ \subseteq ω$ is finite, and $δ \in 2^{Δ}$ , then

$P ({z \in 2^{ω} : z ↾ Δ = δ}) = 1 / 2^{| Δ |} .$
(ii): If $A, B \in Σ$ and $A \subseteq B$ , then $P (A) \leq P (B)$ .

We refer to

2^{ω}

as the sample space, Σ as the event space and

P

as the probability function. Elements of Σ and

P (2^{ω}) \ Σ

are called measurable and non-measurable sets, respectively.

For convenience, we will avoid explicit mention of the set

Σ

when the meaning is still clear. Thus, expressions such as, “

P (A) = P (B)

”, should be understood to mean that either both sides of the equation are defined and equal or that they are both undefined.

Let

I \subseteq ω

and let

A = {x \in 2^{ω} : x ↾ I \in A_{0}}

for some

A_{0} \subseteq 2^{I}

. Given

x \in 2^{ω}

, whether or not

x \in A

depends only on the restriction of x to I. Therefore, we may reasonably identify “the probability that a random element of

2^{I}

is in

A_{0}

” with “the probability that a random element of

2^{ω}

is in A”. In fact, with a slight abuse of notation, we will write

P (A_{0})

to mean

P (A)

.

Notation 1.

Let

I \subseteq ω

, and let

A \subseteq 2^{I}

. We will write

P (A)

as a shorthand for

P ({x \in 2^{ω} : x ↾ I \in A})

.

Notation 2.

Let

I, J \subseteq ω

be disjoint, and let

A \subseteq 2^{I \cup J}

. For

x \in 2^{I}

, we will use

A_{x}

to denote the set

{y \in 2^{J} : x \cup y \in A}

.

Remark 1.

It will frequently be convenient to replace the set

2^{ω}

with

2^{N}

, where N is some countably infinite set. We will call the resulting space an MPS on

2^{N}

. Similarly, if I and J are countable sets, we can speak of an MPS on

2^{I} \times 2^{J}

, using the natural identification between this space and

2^{I ∐ J}

.

3.2. Two Axioms

We now introduce two new intuitively appealing axioms. In Section 4.1 and Section 4.2, we show that each of these is incompatible with Axiom 4.

Definition 2.

An MPS has the Freiling property if the following holds: Let I and J be disjoint subsets of ω. Let

A \subseteq 2^{I \cup J}

. If for every

x \in 2^{I}

,

P (A_{x}) = q

, then

P (A) = q

.

We believe that this definition, which is based on Freiling’s Axiom of Symmetry [6], ought to hold in the fair coin space. To argue this, we will apply a variant of Freiling’s own argument. Suppose I, J, A and q are as in Definition 2. Since the fair coin space is symmetric, it should not matter in what order we flip the coins when choosing a random element. Therefore, let us start with coins corresponding to elements of I, followed by those corresponding to elements of J. Once

x \in 2^{I}

has been determined, our hypothesis tells us that the probability that we will choose

y \in 2^{J}

such that

x \cup y \in A

is exactly q. This is true regardless of the value of x, and hence we may say that it is true before x is chosen. Thus,

P (A) = q

as claimed.

Definition 3.

Let N be a countable set. We let

G (2^{N})

denote the group of transformations on

2^{N}

generated by functions,

σ : 2^{N} \to 2^{N}

, such that either

(i): there exists an $r \in 2^{N}$ such that $σ (x) = x \oplus r$ for all $x \in 2^{N}$ , where ⊕ denotes the bitwise XOR operation, or
(ii): there exists a permutation $π : N \to N$ such that $σ (x) = x \circ π$ for all $x \in 2^{N}$ .

A probability symmetry of

2^{N}

is an element of

G (2^{N})

. We let · denote the natural action of

G (2^{N})

on

2^{N}

.
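The two kinds of generators in Definition 3 can be illustrated on a finite index set, where both are visibly bijections of the space of bit strings. This is a sketch only: the finite size `N = 3` and the function names are assumptions made for the example, whereas the definition itself concerns countable N.

```python
from itertools import product

N = 3  # hypothetical finite index set {0, 1, 2} standing in for N


def xor_shift(r):
    """Generator of type (i): x -> x XOR r, bitwise."""
    return lambda x: tuple(a ^ b for a, b in zip(x, r))


def coordinate_permutation(pi):
    """Generator of type (ii): x -> x ∘ pi, i.e. (x ∘ pi)(n) = x(pi(n))."""
    return lambda x: tuple(x[pi[n]] for n in range(len(x)))


def image(sigma, event):
    """The set sigma · event."""
    return {sigma(x) for x in event}


SPACE = set(product((0, 1), repeat=N))
sigma = xor_shift((1, 0, 1))
tau = coordinate_permutation((1, 2, 0))
```

Both `image(sigma, SPACE)` and `image(tau, SPACE)` equal `SPACE`, and any composite of the two maps sends an event to an event with the same number of points, which is the finite shadow of "probability symmetry".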

Definition 4.

An MPS on

2^{ω}

has the dependent symmetry property if the following holds: Let

I \cup J

be any bipartition of ω; we will identify

2^{ω}

with

2^{I} \times 2^{J}

in the natural way. Then

P ([A \times B] \cup C) = P ([A \times σ \cdot B] \cup C)

, for all

A \subseteq 2^{I}

,

B \subseteq 2^{J}

,

C \subseteq \bar{A} \times 2^{J}

and

σ \in G (2^{J})

.

Again, we will argue that this definition ought to be satisfied by the fair coin space. Fix

σ \in G (2^{J})

. Consider randomly choosing an element

(x, y) \in 2^{I} \times 2^{J}

using two different methods. The first is just that of the fair coin space. For the second, suppose we choose x and y just as before, but if

x \in A

, we replace y with

σ^{- 1} \cdot y

(so the choice of whether to apply

σ^{- 1}

does not depend on y). From the point of view of probability, there ought to be no difference between these methods of choosing

(x, y)

. Hence, the probabilities that points chosen by each method lie in

(A \times B) \cup C

should be equal if they are defined. This motivates the equality in Definition 4.
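The equality in Definition 4 can be checked by counting in a finite analogue, where I and J each index two coins. The sets `A`, `B`, `C` and the symmetry `sigma` below are hypothetical choices made only for illustration; in the finite case the equality holds automatically because sigma is a bijection and C is disjoint from A × 2^J.

```python
from fractions import Fraction
from itertools import product

BITS = 2  # hypothetical: I and J each index two coins
CUBE = list(product((0, 1), repeat=BITS))


def prob(event):
    """Uniform probability of a subset of 2^I x 2^J."""
    return Fraction(len(event), len(CUBE) ** 2)


def rectangle(A, B):
    """The product set A x B."""
    return {(x, y) for x in A for y in B}


def sigma(y):
    """A sample element of G(2^J): swap the coordinates, then XOR with (1, 0)."""
    swapped = (y[1], y[0])
    return (swapped[0] ^ 1, swapped[1])


A = {(0, 0), (0, 1)}
B = {(0, 0), (0, 1), (1, 1)}
# C lies in (complement of A) x 2^J: its first coordinates avoid A.
C = {((1, 0), (0, 0)), ((1, 1), (0, 1)), ((1, 1), (1, 1))}

lhs = prob(rectangle(A, B) | C)
rhs = prob(rectangle(A, {sigma(y) for y in B}) | C)
```

Both `lhs` and `rhs` equal 9/16 here. Applying sigma only on the slice above A moves points around without changing how many there are.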

Since we believe that the fair coin space is an MPS having the properties given in Definitions 2 and 4, we propose the following axioms.

Axiom 6.

There exists an MPS with the Freiling property.

Axiom 7.

There exists an MPS with the dependent symmetry property.

Both of these axioms assert the existence of sets with specific properties. Moreover, we have given informal arguments as to why these properties ought to be satisfiable. Hence, our axioms may be viewed as examples of the maximize principle explored by Maddy in [4]. This principle essentially states that the set theoretic universe ought to be very “full”, containing as many sets as possible without generating contradictions (e.g., as one gets by declaring

{x : x \notin x}

a set).

Remark 2.

The usual product measure on

2^{ω}

satisfies Definition 1 (though not Axioms 6 or 7). Hence, our arguments may be adapted to that setting.

4. Results

4.1. A Vitali-Type Paradox

In this and the following sections, we will assume the axioms of Zermelo–Fraenkel set theory without Choice or Replacement. When axioms beyond this are used, we will say so explicitly.

Lemma 1.

Let

(Σ, P)

be an MPS with the Freiling property. Let

Δ \subseteq ω

be finite, and let

A \subseteq 2^{ω}

have the following property:

For every $f : ω \ Δ \to 2$ , there exists a (unique) $δ_{f} : Δ \to 2$ ,
such that for all $g : Δ \to 2$ , $f \cup g \in A \Leftrightarrow g = δ_{f}$ .

Then

P (A) = 1 / 2^{| Δ |}

.

Proof.

This is immediate from Definitions 1 (i) and 2. □

Theorem 2.

Axiom 4 and Axiom 6 are incompatible.

Proof.

Let

(Σ, P)

be an MPS on

2^{ω} \times 2^{ω}

with the Freiling property. Let ∼ be the equivalence relation on

2^{ω}

given by

f \sim g \Leftrightarrow | {n \in ω : f (n) \neq g (n)} | < \infty .

(1)

By Axiom 4, there exists a set of representatives,

Γ \subseteq 2^{ω}

, for the equivalence classes of

2^{ω}

. We will view

ω

as a subset of

2^{ω}

by identifying elements of

ω

with their binary representations (padding with 0’s). Then for every

r \in 2^{ω}

, there are unique

r^{🟉} \in Γ

,

r^{'} \in ω

such that

r = r^{'} \oplus r^{🟉}

. Given

r \in 2^{ω}

and

m \in ω

, let

r_{m}

denote the

m^{th}

bit of r. Let

A = {(m, n) \in ω \times ω : n_{m + 1} = 1}

and

B = {(m, n) \in ω \times ω : m_{n} = m_{n + 1} = 0} .

Let

Ω_{A} = {(r, s) \in 2^{ω} \times 2^{ω} : (r^{'}, s^{'}) \in A}

and

Ω_{B} = {(r, s) \in 2^{ω} \times 2^{ω} : (r^{'}, s^{'}) \in B}

.

Fix

r \in 2^{ω}

, and let

m = r^{'}

and

{(Ω_{A})}_{r} = {s \in 2^{ω} : (r, s) \in Ω_{A}}

. Then for any

s \in 2^{ω}

,

(r, s) \in Ω_{A}

iff

(m, s^{'}) \in A

iff

s_{m + 1}^{'} = 1

, so

{(Ω_{A})}_{r} = {s \in 2^{ω} : s_{m + 1} \neq s_{m + 1}^{🟉}}

. By Lemma 1,

P ({(Ω_{A})}_{r}) = \frac{1}{2}

. By Definition 2,

P (Ω_{A}) = \frac{1}{2}

. An analogous argument gives

P (Ω_{B}) = \frac{1}{4}

.

We will show that

A \subseteq B

. Let

(m, n) \in A

. Then

n_{m + 1} = 1

, which implies

n \geq 2^{m + 1}

. If

m_{k} = 1

for any

k \geq n

, then

m \geq 2^{k} \geq 2^{n} \geq 2^{2^{m + 1}}

, which is impossible. Hence,

m_{k} = 0

for all

k \geq n

; thus

m_{n} = m_{n + 1} = 0

, so

(m, n) \in B

. This implies

Ω_{A} \subseteq Ω_{B}

. Therefore,

P (Ω_{A}) \leq P (Ω_{B})

, a contradiction. □
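The combinatorial heart of this proof is the inclusion of A in B, which can be tested mechanically on small inputs. The sketch below assumes that bits are indexed from the least significant digit starting at 0, which is the reading under which n_{m+1} = 1 gives n ≥ 2^{m+1}. The function names are hypothetical.

```python
def bit(k, i):
    """The i-th binary digit of k (bit 0 is the least significant)."""
    return (k >> i) & 1


def in_A(m, n):
    """A = {(m, n) : n_{m+1} = 1}."""
    return bit(n, m + 1) == 1


def in_B(m, n):
    """B = {(m, n) : m_n = m_{n+1} = 0}."""
    return bit(m, n) == 0 and bit(m, n + 1) == 0


def counterexamples(bound):
    """Pairs (m, n) with m, n < bound lying in A but not in B."""
    return [
        (m, n)
        for m in range(bound)
        for n in range(bound)
        if in_A(m, n) and not in_B(m, n)
    ]


violations = counterexamples(200)
```

The list `violations` is empty, in line with the argument that membership in A forces n ≥ 2^{m+1} and hence puts bits n and n + 1 beyond the binary length of m. The inclusion is proper: for instance, (0, 0) lies in B but not in A.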

4.2. The Banach–Tarski Paradox in $2^{ω}$

In this section, we let

F_{2} = F (a, b)

denote the free group on two generators. The following is an adaptation of the proof of the famous Banach–Tarski paradoxical decomposition [10].

Lemma 2

(Dependent Symmetries). Let

I \cup J

be a bipartition of ω. Let

(Σ, P)

be an MPS on

2^{ω} (\approx 2^{I} \times 2^{J})

with the dependent symmetry property. Let

A_{1}, \dots, A_{n} \subseteq 2^{I}

be pairwise disjoint, and let

B_{1}, \dots, B_{n} \subseteq 2^{J}

. If

σ_{i} \in G (2^{J})

for each

i \leq n

, then

P (⋃_{i = 1}^{n} A_{i} \times B_{i}) = P (⋃_{i = 1}^{n} A_{i} \times (σ_{i} \cdot B_{i}))

.

Proof.

This follows immediately from n applications of the dependent symmetry property (Definition 4). □

Lemma 3.

There exists a bijection

π : F_{2} \to F_{2}

such that

π (e) = e

and for all

g, h \in F_{2} \ {e}

, there exists an integer

n > 1

such that

π (g^{n}) = h^{n}

.

Proof.

Let

{(g_{i}, h_{i}) : i < ω}

be an enumeration of

{(F_{2} \ {e})}^{2}

. First, note that for every

g \in F_{2} \ {e}

, the set

{g^{n} : n \in Z_{+}}

is infinite. For

i < ω

, recursively define

n_{i}

to be the least integer

\geq 2

such that

g_{i}^{n_{i}} \notin {g_{j}^{n_{j}} : j < i}

and

h_{i}^{n_{i}} \notin {h_{j}^{n_{j}} : j < i}

. Let

π^{'} : F_{2} \to F_{2}

be the partial function

π^{'} = {(e, e)} \cup {(g_{i}^{n_{i}}, h_{i}^{n_{i}}) : i < ω}

. Since

F_{2} \ {e, g_{i}^{n_{i}}, h_{i}^{n_{i}} : i < ω}

is infinite (e.g., it contains

a^{n} b

for all n), we can extend

π^{'}

to a bijection

π : F_{2} \to F_{2}

. □
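The recursive choice of the exponents n_i can be carried out explicitly on any finite initial segment of the enumeration. The sketch below encodes elements of F_2 as reduced tuples of integers (1, -1, 2, -2 standing for a, a^{-1}, b, b^{-1}); this encoding, the helper names, and the restriction to words of length at most 1 in the demonstration are assumptions made for illustration. It builds only the partial map π', not the full bijection π.

```python
from itertools import product

LETTERS = (1, -1, 2, -2)  # hypothetical encoding: a, a^-1, b, b^-1


def reduce_word(word):
    """Cancel adjacent inverse pairs to obtain the reduced form."""
    out = []
    for letter in word:
        if out and out[-1] == -letter:
            out.pop()
        else:
            out.append(letter)
    return tuple(out)


def power(g, n):
    """The reduced word for g^n, with n >= 1."""
    return reduce_word(g * n)


def nontrivial_words(max_len):
    """Reduced non-identity words of length <= max_len."""
    return [
        w
        for k in range(1, max_len + 1)
        for w in product(LETTERS, repeat=k)
        if reduce_word(w) == w
    ]


def partial_pi(pairs):
    """Greedy construction of pi' on a finite list of pairs (g_i, h_i)."""
    pi = {(): ()}  # pi'(e) = e
    used_g, used_h = set(), set()
    for g, h in pairs:
        n = 2  # least exponent >= 2 avoiding all earlier choices
        while power(g, n) in used_g or power(h, n) in used_h:
            n += 1
        used_g.add(power(g, n))
        used_h.add(power(h, n))
        pi[power(g, n)] = power(h, n)
    return pi


words = nontrivial_words(1)
pi_prime = partial_pi([(g, h) for g in words for h in words])
```

The search for n terminates because the powers of a non-identity element of F_2 are pairwise distinct, so only finitely many exponents are ever blocked. The resulting dictionary is injective by construction, since each new key and each new value is fresh.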

Lemma 4.

Let G be a group acting on a set, X. Let

Y = {x \in X : | G_{x} | = 1}

. Then G fixes Y as a set, and G acts freely on Y.

Proof.

Routine. □

Lemma 5

(Axiom 4). There exist sets

A, B \subseteq 2^{F_{2}}

and

α, β, γ \in G (2^{F_{2}})

such that

(i): $γ \cdot (A^{C} \cap B^{C}) \subseteq A \cup B$ ,
(ii): $β^{m} \cdot A \cap β^{n} \cdot A = \emptyset$ for all $m \neq n \in ω$ and
(iii): $α^{m} \cdot B \cap α^{n} \cdot B = \emptyset$ for all $m \neq n \in ω$

Proof.

By Definition 3, every permutation

π \in Sym (F_{2})

, induces a probability symmetry of

2^{F_{2}}

, given by

f \mapsto f \circ π^{- 1}

. Therefore, there is a natural embedding

Sym (F_{2}) ↪ G (2^{F_{2}})

. Moreover, via Cayley’s theorem, there is a natural embedding

F_{2} ↪ Sym (F_{2})

. Therefore, we have

F_{2} ↪ Sym (F_{2}) ↪ G (2^{F_{2}})

. We will identify

F_{2}

and

Sym (F_{2})

with their images in

G (2^{F_{2}})

. We let

F_{2}

act on

2^{F_{2}}

using this identification.

Let

S = {y \in 2^{F_{2}} : \exists h \in F_{2} \ {e} (h \cdot y = y)}

and let

S^{C} = 2^{F_{2}} \ S

. By Lemma 4,

F_{2}

acts freely on

S^{C}

. Let

Γ

be a set of representatives for the

F_{2}

-orbits of

S^{C}

. Let

W_{c} \subseteq F_{2}

be the set of reduced words starting with c (for

c \in {a, b, a^{- 1}, b^{- 1}}

). Let

X_{a} = W_{a} \cup W_{a^{- 1}} \cup {e}

,

X_{b} = W_{b} \cup W_{b^{- 1}} \cup {e}

. Let

A = X_{a} \cdot Γ

and

B = X_{b} \cdot Γ

. Then

S = A^{C} \cap B^{C}

.

Let

π \in Sym (F_{2}) \leq G (2^{F_{2}})

be as in Lemma 3; let

ϕ : 2^{F_{2}} \to 2^{F_{2}}

be the map that flips the

e^{th}

bit. Let

γ = ϕ \circ π \in G (2^{F_{2}})

. For any

y \in S

, we have

g \cdot y = y

for some

g \in F_{2} \ {e}

. Fix

h \in F_{2} \ {e}

. By Lemma 3,

π (g^{n}) = h^{n}

for some

n > 1

. Hence,

(h^{n} γ \cdot y) (e) = (γ \cdot y) (h^{n}) = (π \cdot y) (h^{n}) = y (g^{n}) = y (e) = (π \cdot y) (e) \neq (γ \cdot y) (e)

. Therefore,

h^{n} \cdot (γ \cdot y) \neq γ \cdot y

, so h does not fix

γ \cdot y

. Since h was arbitrary,

γ \cdot y \in S^{C}

. Since

y \in S

was arbitrary,

γ \cdot S \subseteq S^{C} = A \cup B

.

Let

α, β

be

a, b

, respectively. To complete the proof, it is sufficient (by symmetry) to show that

b^{m} \cdot A \cap b^{n} \cdot A = \emptyset

for all

m \neq n \in ω

. Suppose

y \in b^{m} \cdot A \cap b^{n} \cdot A

. Then

y = (b^{m} g_{1}) \cdot z_{1} = (b^{n} g_{2}) \cdot z_{2}

, for some

z_{1}, z_{2} \in Γ

and

g_{1}, g_{2} \in X_{a}

. Then

z_{1}

and

z_{2}

are in the same

F_{2}

-orbit of

S^{C}

, and so in fact

z_{1} = z_{2} = z

. We now have

(g_{2}^{- 1} b^{m - n} g_{1}) \cdot z = z

, which implies

g_{2}^{- 1} b^{m - n} g_{1} = e

, since

z \in S^{C}

. Thus,

b^{m - n} = g_{2} g_{1}^{- 1} \in X_{a}

, which yields

m = n

. □
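The last step rests on a fact about reduced words: left-multiplying X_a by distinct non-negative powers of b gives disjoint subsets of F_2. The following sketch checks this for words of bounded length, using the same hypothetical integer encoding of letters (1, -1, 2, -2 for a, a^{-1}, b, b^{-1}). It concerns only the group-theoretic ingredient; the passage to A = X_a · Γ additionally uses the free action on S^C and the choice of Γ, which cannot be computed.

```python
from itertools import product

LETTERS = (1, -1, 2, -2)  # hypothetical encoding: a, a^-1, b, b^-1


def reduce_word(word):
    """Cancel adjacent inverse pairs to obtain the reduced form."""
    out = []
    for letter in word:
        if out and out[-1] == -letter:
            out.pop()
        else:
            out.append(letter)
    return tuple(out)


def X_a(max_len):
    """Identity plus reduced words of length <= max_len starting with a or a^-1."""
    words = {()}
    for k in range(1, max_len + 1):
        for w in product(LETTERS, repeat=k):
            if abs(w[0]) == 1 and reduce_word(w) == w:
                words.add(w)
    return words


def translate(m, words):
    """Left-multiply every word by b^m, for m >= 0."""
    return {reduce_word((2,) * m + w) for w in words}


def pairwise_disjoint(sets):
    """True when no two sets in the list share an element."""
    return all(
        sets[i].isdisjoint(sets[j])
        for i in range(len(sets))
        for j in range(i + 1, len(sets))
    )


xa = X_a(3)
translates = [translate(m, xa) for m in range(5)]
disjoint = pairwise_disjoint(translates)
```

Here `disjoint` is `True`: a word in b^m · X_a begins with exactly m copies of b followed by either nothing or a letter a^{±1}, so the exponent m can be read off from the word.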

Theorem 3.

Axioms 4 and 7 are incompatible.

Proof.

Let

x, y, z

be three elements not in

F_{2}

, and let

(Σ, P)

be an MPS on

2^{{x, y, z}} \times 2^{F_{2}}

with the dependent symmetry property. Let

A, B \subseteq 2^{F_{2}}

and

α, β, γ \in G (2^{F_{2}})

be as in Lemma 5; let

S = {(A \cup B)}^{C}

. Furthermore, let

S_{A} = S \cap γ^{- 1} \cdot A

and

S_{B} = S \cap γ^{- 1} \cdot B

. Let

(i): $P_{0} = S_{A}$ , $P_{1} = S_{B} \ S_{A}$ , $P_{2} = A$ , $P_{3} = B \ A$
(ii): $Q_{0} = β γ \cdot S_{A}$ , $Q_{1} = α γ \cdot (S_{B} \ S_{A})$ , $Q_{2} = β^{2} \cdot A$ , $Q_{3} = α^{2} \cdot (B \ A)$
(iii): $R_{0} = β^{3} γ \cdot S_{A}$ , $R_{1} = α^{3} γ \cdot (S_{B} \ S_{A})$ , $R_{2} = β^{4} \cdot A$ , $R_{3} = α^{4} \cdot (B \ A)$

Note that

P_{0} \cup P_{1} \cup P_{2} \cup P_{3} = 2^{F_{2}}

and

Q_{0} \cup Q_{1} \cup Q_{2} \cup Q_{3} \cup R_{0} \cup R_{1} \cup R_{2} \cup R_{3} \subseteq 2^{F_{2}}

are both pairwise disjoint unions. Using Lemma 5, we have the following:

\frac{1}{4} = P ({2^{{x}}}^{⌢} 00^{⌢} 2^{F_{2}}) = P ([000^{⌢} 2^{F_{2}}] \cup [100^{⌢} 2^{F_{2}}]),

by Definition 1 (i). Partitioning the set, this is equal to

P ([000^{⌢} P_{0}] \cup [000^{⌢} P_{1}] \cup [000^{⌢} P_{2}] \cup [000^{⌢} P_{3}] \cup

[100^{⌢} P_{0}] \cup [100^{⌢} P_{1}] \cup [100^{⌢} P_{2}] \cup [100^{⌢} P_{3}]),

which, by Lemma 2, equals

P ([000^{⌢} P_{0}] \cup [001^{⌢} P_{1}] \cup [010^{⌢} P_{2}] \cup [011^{⌢} P_{3}] \cup

[100^{⌢} P_{0}] \cup [101^{⌢} P_{1}] \cup [110^{⌢} P_{2}] \cup [111^{⌢} P_{3}]) .

By Lemma 2 again, this is

P ([000^{⌢} Q_{0}] \cup [001^{⌢} Q_{1}] \cup [010^{⌢} Q_{2}] \cup [011^{⌢} Q_{3}] \cup

[100^{⌢} R_{0}] \cup [101^{⌢} R_{1}] \cup [110^{⌢} R_{2}] \cup [111^{⌢} R_{3}]),

which is further equal to

P ([000^{⌢} Q_{0}] \cup [000^{⌢} Q_{1}] \cup [000^{⌢} Q_{2}] \cup [000^{⌢} Q_{3}] \cup

[000^{⌢} R_{0}] \cup [000^{⌢} R_{1}] \cup [000^{⌢} R_{2}] \cup [000^{⌢} R_{3}]) .

By Definition 1 (ii), this is at most

P (000^{⌢} 2^{F_{2}})

, which equals

\frac{1}{8}

, by Definition 1 (i); a contradiction. □

5. Conclusions

We have shown that the Axiom of Choice is incompatible with highly intuitive assumptions about probability (Axiom 6 or Axiom 7). Our work improves on other similar results in two important ways. First, our contradictions rely on a weaker version of AC than that used in [6,7], which both assume the well-orderability of

\mathbb{R}

. Second, we analyze the philosophical “weak points” in Kolmogorov’s axiomatization of probability, and show that our results do not depend on these.

While the standard resolution to these paradoxes is to reject Axioms 6 and 7, we believe that there are three plausible options:

(i): reject the Infinity or Powerset axioms, so $2^{ω}$ cannot be constructed,
(ii): reject Axiom 4, or
(iii): reject Axioms 6 and 7.

For those with finitistic leanings, option (i) will be the obvious choice, and our paradoxes may be viewed as consequences of invalid reasoning about “completed infinities”. For mathematicians who accept completed infinities, but believe that valid mathematical objects must be explicitly constructed, option (ii) may be appealing.

Finally, if we reject Axioms 6 and 7, we can keep all of ZFC. Unfortunately, these axioms seem to follow immediately from our most basic intuitions about random events. Therefore, the authors believe that rejecting them amounts to a rejection of randomness as a valid mathematical concept. This leads one to question whether probability can be meaningfully formalized at all.

Each of these attempts at a resolution has drawbacks, and it is not clear to the authors what approach is best. The aim of this paper has been to make these issues more widely understood, especially by mathematicians working outside of set theory. We hope that in so doing, we will stimulate an interesting conversation about the conflicts between probability, infinite sets, and the Axiom of Choice.

Author Contributions

Conceptualization, A.H.; writing—original draft preparation, A.H.; writing—review and editing, A.H. and J.C.; supervision, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank R. Koncel-Kedziorski, J. D. Hamkins, and the anonymous referees for their helpful feedback on the early drafts of this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Galvin, F. Problem 5348. Am. Math. Mon. 1965, 72, 1136.
2. Hardin, C.S.; Taylor, A.D. An introduction to infinite hat problems. Math. Intell. 2008, 30, 20–25.
3. Matijasevic, J.V. Enumerable sets are diophantine. Soviet Math. Dokl. 1970, 11, 354–358.
4. Maddy, P. Believing the axioms. I. J. Symb. Log. 1988, 53, 481–511.
5. Jech, T. Set Theory: The Third Millennium Edition, Revised and Expanded; Springer: Berlin, Germany, 2003.
6. Freiling, C. Axioms of symmetry: Throwing darts at the real number line. J. Symb. Log. 1986, 51, 190–200.
7. Van Lambalgen, M. Independence, randomness and the axiom of choice. J. Symb. Log. 1992, 57, 1274–1304.
8. Chung, K.L.; Zhong, K. A Course in Probability Theory; Academic Press: San Diego, CA, USA, 2001.
9. Hofweber, T.; Schindler, R. Hyperreal-valued probability measures approximating a real-valued measure. Notre Dame J. Form. Log. 2016, 57, 369–374.
10. Banach, S.; Tarski, A. Sur la décomposition des ensembles de points en parties respectivement congruentes. Fund. Math. 1924, 6, 244–277.
11. Villegas, C. On qualitative probability. Am. Math. Mon. 1967, 74, 661–669.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

