Axiom of Choice

“The Axiom of Choice is necessary to select a set from an infinite number of pairs of socks, but not an infinite number of pairs of shoes.” — Bertrand Russell, Introduction to mathematical philosophy

“Tarski told me the following story. He tried to publish his theorem [the equivalence between the Axiom of Choice and the statement ‘every infinite set A has the same cardinality as AxA’] in the Comptes Rendus Acad. Sci. Paris but Fréchet and Lebesgue refused to present it. Fréchet wrote that an implication between two well known propositions is not a new result. Lebesgue wrote that an implication between two false propositions is of no interest. And Tarski said that after this misadventure he never again tried to publish in the Comptes Rendus.”

  • Jan Mycielski, A System of Axioms of Set Theory for the Rationalists

Obligatory Introduction

The Axiom of Choice, the most controversial axiom of the 20th Century.

The axiom states that a certain kind of function, called a `choice’ function, always exists. It is called a choice function, because, given a collection of non-empty sets, the function ‘chooses’ a single element from each of the sets. It is a powerful and useful axiom, asserting the existence of useful mathematical structures (such as bases for vector spaces of arbitrary dimension, and ultraproducts). It is a generally accepted axiom, and is in wide use by mathematicians. In fact, according to Elliott Mendelson in Introduction to Mathematical Logic (1964) “The status of the Axiom of Choice has become less controversial in recent years. To most mathematicians it seems quite plausible and it has so many important applications in practically all branches of mathematics that not to accept it would seem to be a wilful hobbling of the practicing mathematician. ”

Neverless, being an axiom, it cannot be proven and must instead be assumed. In particular, it is an axiom of set theory and it is not provable from the other axioms (the Zermelo-Fraenkel axioms of Set Theory). In fact many mathematicians, in particular constructive mathematicians, reject the axiom, stating that it does not capture a ‘real’ or ‘physical’ property, but is instead just a mathematical oddity, an artefact of the mathematics used to approximate reality, rather than reality itself. In the words of the LessWrong community: the constructive mathematicians would claim it is a statement about the map, and not the territory.

Historically, the axiom has experienced much controversy. Before it was shown to be independent of the other axioms, it was believed either to follow from them (i.e., be ‘True’) or lead to a contradiction (i.e., be ‘False’). Its independence from the other axioms was, in fact, a very surprising result at the time.

Getting the Heavy Maths out the Way: Definitions

Intuitively, the axiom of choice states that, given a collection of non-empty sets, there is a function which selects a single element from each of the sets.

More formally, given a set \(X\) whose elements are only non-empty sets, there is a function

$$ f: X \rightarrow \bigcup_{Y \in X} Y $$
from \(X\) to the union of all the elements of \(X\) such that, for each \(Y \in X\), the image of \(Y\) under \(f\) is an element of \(Y\), i.e., \(f(Y) \in Y\).

In logical notation,

$$ \forall_X \left( \left[\forall_{Y \in X} Y \not= \emptyset \right] \Rightarrow \left[\exists \left( f: X \rightarrow \bigcup_{Y \in X} Y \right) \left(\forall_{Y \in X} \exists_{y \in Y} f(Y) = y \right) \right] \right) $$

Axiom Unnecessary for Finite Collections of Sets

For a finite set \(X\) containing only finite non-empty sets, the axiom is actually provable (from the Zermelo-Fraenkel axioms of set theory ZF), and hence does not need to be given as an axiom. In fact, even for a finite collection of possibly infinite non-empty sets, the axiom of choice is provable (from ZF), using the axiom of induction. In this case, the function can be explicitly described. For example, if the set \(X\) contains only three, potentially infinite, non-empty sets \(Y_1, Y_2, Y_3\), then the fact that they are non-empty means they each contain at least one element, say \(y_1 \in Y_1, y_2 \in Y_2, y_3 \in Y_3\). Then define \(f\) by \(f(Y_1) = y_1\), \(f(Y_2) = y_2\) and \(f(Y_3) = y_3\). This construction is permitted by the axioms ZF.

The problem comes in if \(X\) contains an infinite number of non-empty sets. Let’s assume \(X\) contains a countable number of sets \(Y_1, Y_2, Y_3, \ldots\). Then, again intuitively speaking, we can explicitly describe how \(f\) might act on finitely many of the \(Y\)s (say the first \(n\) for any natural number \(n\)), but we cannot describe it on all of them at once.

To understand this properly, one must understand what it means to be able to ‘describe’ or ‘construct’ a function \(f\). This is described in more detail in the sections which follow. But first, a bit of background on why the axiom of choice is interesting to mathematicians.

Controversy: Mathematicians Divided! Counter-Intuitive Results, and The History of the Axiom of Choice

Mathematicians have been using an intuitive concept of a set for probably as long as mathematics has been practiced. At first, mathematicians assumed that the axiom of choice was simply true (as indeed it is for finite collections of sets).

Georg Cantor introduced the concept of transfinite numbers and different cardinalities of infinity in a 1874 paper (which contains his infamous Diagonalization Argument) and along with this sparked the introduction of set theory. In 1883, Cantor introduced a principle called the ‘Well-Ordering Princple’ (discussed further in a section below) which he called a ‘law of thought’ (i.e., intuitively true). He attempted to prove this principle from his other principles, but found that he was unable to do so.

Ernst Zermelo attempted to develop an axiomatic treatment of set theory. He managed to prove the Well-Ordering Principle in 1904 by introducing a new principle: The Principle of Choice. This sparked much discussion amongst mathematicians. In 1908 published a paper containing responses to this debate, as well as a new formulation of the Axiom of Choice. In this year, he also published his first version of the set theoretic axioms, known as the Zermelo Axioms of Set Theory. Mathematicians, Abraham Fraenkel and Thoralf Skolem improved this system (independently of each other) into its modern version, the Zermelo Fraenkel Axioms of Set Theory.

In 1914, Felix Hausdorff proved Hausdorff’s paradox. The ideas behind this proof were used in 1924 by Banach and Alfred Tarski to prove the more famous Banach-Tarski paradox (discussed in more detail below). This latter theorem is often quoted as evidence of the falsehood of the axiom of choice.

Between 1935 and 1938, Kurt Gödel proved that the Axiom of Choice is consistent with the rest of the ZF axioms.

Finally, in 1963, Paul Cohen developed a revolutionary mathematical technique called forcing, with which he proved that the axiom of choice could not be proven from the ZF axioms (in particular, that the negation of AC is consistent with ZF). For this, and his proof of the consistency of the negation of the Generalized Continuum Hypothesis from ZF, he was awarded a fields medal in 1966.

This axiom came to be accepted in the general mathematical community, but was rejected by the constructive mathematicians as being fundamentally non-constructive. However, it should be noted that in many forms of constructive mathematics, there are provable versions of the axiom of choice. The difference is that in general in constructive mathematics, exhibiting a set of non-empty sets (technically, in constructive set-theory, these should be ‘inhabited’ sets) also amounts to exhibiting a proof that they are all non-empty, which amounts to exhibiting an element for all of them, which amounts to exhibiting a function choosing an element in each. So in constructive mathematics, to even state that you have a set of inhabited sets requires stating that you have a choice function to these sets proving they are all inhabited.

Some explanation of the history of the axiom of choice (as well as some of its issues) can be found in the paper “100 years of Zermelo’s axiom of choice: what was the problem with it?” by the constructive mathematician Per Martin-Löf at this webpage.

(Martin-Löf studied under Andrey Kolmogorov of Kolmogorov complexity and has made contributions to information theory, mathematicalstatistics, and mathematical<em>logic, including developing a form of intuitionistic type theory).

A nice timeline is also summarised on Stanford Encyclopaedia of Logic.

So, What is this Choice Thing Good for Anyways?

The Axiom of Choice is in common use by mathematicians in practice. Amongst its many applications are the following:

Non-Empty Products

This is the statment that taking the mathematical product of non-empty sets will always yield a non-empty set. Consider infinitely many sets \(X_1, X_2, X_3, \ldots\) indexed by the natural numbers. Then an element of the product \(\prod_{i \in \mathbb{N}} X_i\) is a family (essentially an infinite tuple) of the form \((x_1, x_2, x_3, \ldots )\) where \(x_1 \in X_1\), \(x_2 \in X_2\) and so on. However, to select such a tuple in the product amounts to selecting a single element from each of the sets. Hence, without the axiom of choice, it is not provable that there are any elements in this product.

Note, that without the axiom, it is not necessarily the case that the product is empty. It just isn’t provable that there are any elements. However, it is consistent that there exists some product that is empty. Also, bear in mind that even without the axiom of choice, there are products of this form which are non-empty. It just can’t be shown that they are non-empty in general.

However, it does seem somewhat counterintuitive that such a product be non-empty. In general, given two non-empty sets, \(X_1\) and \(X_2\), the product is at least as large as either of the sets. Adding another non-empty set \(X_3\) usually makes the product larger still. It may seem strange, then, that taking the limit of this procedure could result in an empty structure.

Existence of a basis for any (esp. infinite dimensional) vector space

This example is discussed in more detail in the next section.

A vector space can be intuitively thought of as a collection of vectors which themselves can be thought of as arrows (or the information of a ‘direction’ and a ‘magnitude’).

A vector space has a number of ‘directions’ in which the vectors can point, referred to as its vector space dimension. For a finite-dimensional vector space, it is possible to find a basis consisting of vectors. The property of a basis is that:

  • any vector in the space can be built up as a combination of the vectors in the basis

  • none of the vectors in the basis can be built up as a combination of the rest of the vectors in the basis.

For finite-dimensional vector spaces, such a basis is finite and can be found. However, for infinite-dimensional vector spaces, a way of finding a basis does not always exist. In fact, if the axiom of choice is false, then there is an infinite-dimensional vector space for which it is impossible to find a basis.

Brouwer’s Fixed-Point Theorem: the existence of a fixed point for a function from a “nice” shape to itself

Any continuous functions \(f: C \rightarrow C\) from a closed disk \(C\) onto itself has a fixed point \(x_0\). (In full generality, \(C\) may be any convex compact set).

In other words, there is at least one point \(x_0 \in C\) such that \(f(x_0) = x_0\).

This works also for rectangles, and even in multiple dimensions. Hence, if true for real world objects, this theorem has the following somewhat surprising consequences:

  • Take two identical sheets of graph with marked coordinates. Crumple up sheet B and place the crumpled ball on top of sheet A. Then, there is some coordinate \((x , y)\) (where \(x\) and \(y\) are real numbers, not necessarily integers), such that this coordinate on sheet B is directly above that coordinate on sheet A.

  • Take a cup of tea. Stir it and let it settle. There is some point of the tea which ends up in the same place it started.

  • Take a map of France. Place it on the ground in France. Take a pin. There is a point on the map through which you could stick the pin and the pin will also stick into the ground at the point represented on the map.

Existence of ultrafilters and hence ultraproducts

The following example is somewhat technical. An attempt is made to describe it very roughly.

Given an indexing set \(I\), and a collection of mathematical structures \((A_i)_{i \in I}\) (of a certain type) indexed by \(I\). (For example, let \(I\) be the natural numbers \(\mathbb{N}\) and let each of the mathematical structures \(A_n\) be numbered).

An ultrafilter \(\mathcal{U}\) on \(I\) is a collection of subsets of \(I\) of a special type. Intuitively it should be thought of as a collection of `big subsets’ of \(I\). It is possible to form the set of all cofinite subsets of \(I\) without the axiom of choice, and \(\mathcal{U}\) should contain at least these. However, for mathematical reasons, \(\mathcal{U}\) should also contain ‘as many sets as possible’. However, in order to do so, there are some ‘arbitrary choices’ that have to be made. This is where the axiom of choice comes in.

One of the applications of ultrafilters is ultraproducts. For each subset \(X \subseteq I\) such that \(X \in \mathcal{U}\) there is a subcollection \((A_i)_{i \in X}\) of \((A_i)_{i \in I}\). Call such a subcollection a “large collection”. The ultraproduct \(A\) is a structure that captures the properties of large collections of the \(A_i\)s, in the sense that a statement of first order logic is true of the ultraproduct \(A\) if and only if it is true of some large collection of the \(A_i\)s.

Now, any statement that is either true for cofinitely many \(A_i\)s or false for cofinitely many \(A_i\)s will be true or false respectively for \(A\). But what about the other statements? This is where the arbitrary choices come in. Each statement needs to be either true or false of \(A\), and we use the axiom of choice to form an ultrafilter that does that for us.

One basic example of an application of ultrafilters is forming the nonstandard real numbers.

Further examples of applications of the axiom of choice may be found on the Wikipedia page here and here.

Physicists Hate Them! Find out How Banach and Tarski Make Infinity Dollars with this One Simple Trick!

One of the most counter-intuitive consequences of the Axiom of Choice is the Banach-Tarski Paradox. It is a theorem provable using the Zermelo-Fraenkel axioms along with the axiom of choice.

This theorem was proven in a 1924 paper by Stefan Banach and Alfred Tarski.

Intuitively, what the theorem says is that it is possible to take a ball, cut it into five pieces, rotate and shift these pieces and end up with two balls.

Now, there are some complications, including the fact the pieces themselves are infinitely complex, and the have to pass through each other when they are being shifted. There is no way a practical implementation of this theorem could be developed. Nevertheless, that the volume of a ball could be changed just by cutting, rotating and shifting seems highly counter-intuitive.

A suprisingly good video explanation in laymans terms by Vsauce can be found Here.

The video Infinity shapeshifter vs. Banach-Tarski paradox by Mathologor advertises itself as a prequel to the above video, which puts you ‘in the mindset’ of a mathematician, so-to-speak, and makes the result a bit less surprising.

This theorem is the main counterexample used as evidence of the falsehood of the Axiom of Choice. If not taken as evidence of its falsehood, thsi is at least used as evidence of the counter-intuitiveness of AC.

Q: What’s an Anagram of Banach-Tarski? A: Banach-Tarski Banach Tarski

How Something Can Exist Without Actually Existing: The Zermelo Fraenkel Axioms and the Existence of a Choice Function

The Zermelo-Fraenkel axioms of Set Theory (ZF) introduce a fundamental concept, called a set, and a relation between sets, called the element of relation (usally written \(\in\)) where \(x \in X\) should be interpreted as ‘$x$ is contained inside \(X\) in the way that an item is contained inside a box’. There are then a number of axioms imposed on these fundamental objects and this relation.

What one must remember is that theorems derived from these axioms are merely statements of the form that ‘if one has a system which satisfies these laws (even if, for example \(\in\) is interpreted as something entirely different from being contained inside something), then it must also satisfy the statements of the theorems’. However, its general use is to imagine sets as being something like boxes which contain mathematical objects. Moreover, almost any statement of mathematics can be stated in terms of sets, where the mathematical objects in question become sets of a certain kind. In this way, since the mathematical objects in question are set up to satisfy the axioms, then anything which can be derived from these axioms will also hold for the mathematical objects.

In particular, a function can be interpreted as a specific kind of set. In particular, it is a set of ordered pairs (more generally, ordered n-tuples, each of which can itself be interpreted as a specific kind of set) satisfying a specific property.

There are different ways of stating the same axioms (by separating or combining axioms, giving different formulations of the same axioms, or giving different axioms that are equivalent given the other axioms) hence what follows is only a specific formulation, namely, the Zermelo-Fraenkel one from Wikipedia.

The axioms begin by stating that two sets are the same if they have the same elements. Then the axiom regularity states sets are well-behaved in a certain way that’s not so important to us right now.

Now comes the part that is important for our purposes: The axiom schema of specification (actually a schema specifying infinitely many axioms, but we can pretend it is just one axiom for now). This is an axiom asserting the existence of certain sets. In a sense, it allows one to ‘create’ a new set out of an existing one. Namely, given a set \(X\) and a statement \(\phi\) of first order logic (a statement about sets of a specific, very formal form, and which uses only the \(\in\) symbol and the reserved symbols of logic), it is possible to create a set \(\{x \in X : \phi(x) \}\) of all of the elements of \(X\) for which the formula \(\phi\) is true.

For example, if we know (or assume) the set of all numbers \(\mathbb{N}\) exists, and we have some way of formalising the statement ‘$x$ is an even number’ as a first-order statement \(\phi(x)\), then the set of all even numbers exists.

Additionally the axioms of pairing and union, axiom schema of replacement, and axiom of power set are all of the form “Given that some sets \(A, B, C, \ldots\) exist, then some other set \(X\) also exists. The axiom of infinity simply states that an infinite set with certain properties exists.

Notice that all of the above are axioms. It is not expected that any of them be proven. They are simply assumptions that you make about whichever system you want to reason. Any theorems that you can prove from these axioms will then be true about your system. However, mathematicians generally agree that these axioms capture our intuitive notion about how “sets” of objects (or even concepts) should behave, and about which sets we are allowed to reason (which sets ‘exist’). Most of these (except maybe the axiom of infinity, and even that one possibly) seem to apply to our world and seem to work fine.

Now, the last axiom, the axiom of choice (or the well-ordering principle) asserts that a certain kind of function exists. It cannot be proven from the above. In other words, given a system that satisfies all of the above, it cannot be assumed that the system also satisfies this axiom (nor in fact that it does not satisfy this axiom). That’s all there is to it, really.

Yet, mathematicians do disagree about this axiom, and whether it applies to our world as well. Some mathematicians take a Platonic view of mathematics, in which mathematical objects such as sets actually exist in some abstract realm, and for which the axiom of choice is either true or false, but we do not know which. Others take a highly constructive view (in many cases motivated by realism and the ability of the mathematics to model the world) in which either the axiom of choice is false, or infinite sets do not exist in which case the axiom of choice is provable and hence superfluous. Others take the view that the axiom is not true or false, but merely useless, and that anything provable from it is meaningless. Many seem not to care: the axiom is convenient for the mathematics they wish to do (whose application they are not much concerned about in any case) and hence they assume it without qualm.

How Something can be Neither True nor False:

How Could we Possibly Know that AC is Independent of ZF?# It has been stated above multiple times that the Axiom of Choice is independent from the other Zermelo-Fraenkel axioms. In other words, it can neither be proven nor disproven from those axioms. But how can we possibly know this fact?

The answer lies in models. However, these are not physical or even computational models, but models of a more abstract kind.

For example, a model of group theory is a specific group, which itself can be characterized as a specific set. Now, notice that the axioms of group theory say nothing about whether a given group is abelian (commutative) or not. It does not follow from the axioms that for groups it is always true that \(xy = yx\), nor does it follow that for groups there are always some \(x\) and \(y\) for which \(xy \not= yx\). In other words, the “abelian axiom” is independent of the axioms of group theory. How do we know this fact? We need simply exhibit two models, two groups, one of which is abelian and the other not. For these groups, I pick, oh say the cyclic group on 3 elements and the symmetric group \(S_3\). The first is abelian, the second, not.

In order to reason about such models of set theory, one assumes the existence of “meta-sets” in some meta-theory. The entire “universe” of set theory is then a certain “meta-set” behaving in a certain way. In case this feels too much like cheating, this StackExchange answer should help clear things up. In particular, the following quote from VanLiere’s thesis:

″ Since these questions all have to do with first-order provability, we could take as our metatheory some very weak theory (such as Peano arithmetic) which is sufficient for formalizing first-order logic. However, as is customary in treatises about set theory, we take as our metatheory ZF plus the Axiom of choice in order to have at our disposal the infinitary tools of model theory. We will also use locutions such as … which are only really justifiable in some even stronger metatheory with the understanding that they could be eliminated through the use of Boolean-valued models or some other device. ”

In other words, it is possible to form these models in some weaker theory on which we have “more of a grip”. The entirity of set theory are then special objects satisfying the axioms of this weaker theory. Is it possible to repeat this process ad infinitum? No. But if we want, we could even deal with just a finite fragment of set theory: We assume that any mathematics we want to do only needs a finite number of the (infinitely many) axioms of set theory. We then prove what we want about this finite fragment. But we may as well have proved it about the whole theory.

Now, pick (or construct) two specific objects of the meta-theory such that in one of them, the axiom of choice is true, and that in the other, the axiom of choice is false.

To obtain these two models requires vastly different approaches which will not be described in detail here. More detail can be found online in Kunen’s text. The consistency of choice is the easier direction, and involves constructing something called Gödel’s constructible universe of sets (or just Gödel’s universe or the constructible universe). The consistency of the negation of choice is more difficult, and involves a technique developed by Paul Cohen called forcing.

A Rose by Any Other Name: Alternative Characterizations of AC

There are a few similar ways to state the axiom of choice. For example:

Given any set \(C\) containing non-empty sets which are pairwise disjoint (any two sets in \(C\) do not intersect), then there is some set \(X\) that intersects each of the sets in \(C\) in exactly one element.

There are also many alternative theorems which at first glance appear to be very different from the axiom of choice, but which are actually equivalent to it, in the sense that each of the theorems can both be proven from the axiom of choice and be used to prove it (in conjunction with the other ZF axioms).

A few examples:

  • Zorn’s Lemma (Described in more detail below).

  • Well-Ordering Theorem (Also described in more detail below).

  • The product of non-empty sets is non-empty.

  • Tarski’s Theorem for Binary Products (from the quote at the start of this article) that \(A\) is bijective to \(A \times A\).

  • Every surjective function has a right inverse.

  • Given two sets, either

  • Every vector space has a basis

  • Every connected graph has a spanning tree.

  • For any pair of sets have comparablecardinality: for any pair of sets \(X\) and \(Y\), either they are the same size, or one is smaller than the other.

More examples can be found on the Wikipedia page.

Because of the intuitive nature of some of these statements (especially that products are non-empty, that vector spaces have bases and that cardinalities are comparable), they are often used as evidence for the truth, or motivation for the use, of the Axiom of Choice.

Zorn’s Lemma? I hardly Know her!

The following is a specific example of a very common way in which the axiom of choice is used, called Zorn’s Lemma. It is a statement equivalent to the axiom of choice, but easier to use in many mathematical applications.

The statement is as follows:

Every partially ordered set (poset) for which every chain has an upper bound has a maximal element.

In other words, if you have an ordered set \(X\) and you consider any linearly ordered subset \(C\), and there is some element \(u \in X\) which is at least as large as any element in \(C\), i.e., \(u \geq c\) for any \(c \in C\), then there is an element \(m \in X\) which is maximal, in the sense that it is at least as big as any comparable element of \(X\), i.e., for any \(x \in X\), it holds that \(m \not< x\). (It is maximal, but not necessarily a global maximum. There is nothing in \(X\) lying above \(m\), but there may be incomparable elements).

Again, this is provable for a finite poset X, and for some infinite posets X, but not provable in general.

Now, why is this rather arcane statement useful?

Well, often for some type of mathematical structure we are interested in, all the structures of that type form a poset under inclusion, and if a maximal such structure existed, it would have a particularly nice property. Furthermore, for many structures, a union of all the structures in a upwards-increasing chain of such structures (under inclusion) is itself a structure of the right type, as well as an upper-bound for the chain. Then Zorn’s lemma gives us the maximal structure we are looking for.

As an example, consider a (esp. infinite-dimensional) vector space \(V\), and trying to find a basis for \(V\).

Choose any element \(v_1 \in V\). Now consider all possible linearly independent sets containing \(v\). These form a poset (which contains at least the set \(\{v_1\}\) since that is linearly independent). Now consider any chain, possibly infinite, of such sets. It looks like \(\{v_1\} \subseteq \{v, v_2\} \subseteq \{v_1, v_2, v_3 \} \subseteq \cdots\). Then take the union of all the sets in the chain \( \{v_1\} \cup \{v_1, v_2\} \cup \{v_1, v_2, v_3 \} \cdots = \{v_1, v_2, v_3, \ldots \}\). Call it \(B\). Then \(B\) contains all the elements in any of the sets in the chain.

It can be shown to be linearly independent, since if some element \(v_i\) could be formed as a linear combination of finitely many other elements in \(B\), then this could be done already in one of the sets in the chain.

Then every chain of such linearly indpendent sets has an upper bound, so the hypothesis of Zorn’s Lemma holds. Then by Zorn’s Lemma, there is a maximal element \(M\). By definition, this maximal element has no superset that is itself linearly independent. This set of vectors also spans \(V\) (every element of \(V\) can be written as a linear combination of vectors in \(M\)), since if it did not, then there would be a vector \(v \in V\) which is linearly independent of \(M\) (cannot be written as a linear combination of vectors in \(M\)) and then the set \(M \cup \{v\}\), which is \(M\) adjoined with \(v\), strictly contains \(M\), contradicting the maximality of \(M\).

Since the definition of a basis is a maximal linearly independent set spanning \(V\), the proof is done. QED.

One might wonder why Zorn’s lemma is even necessary. Why could we not just have picked the union of the chain as our basis? In a sense, we could have, provided we use the correct chain. For example, the chain of two elements \(\{v_1\}\) and \(\{v_1, v_2\}\) is not sufficient. We need an infinite chain (and, in fact, a large enough infinite chain at that) .
But there is a difference between being able to prove that any chain has an upper bound, and being able to actually choose a specific chain that works. In some sense, without Zorns lemma, we can reason in a very general vague way that, yes, the chains all have upper bounds, and there might be a long enough chain, and if there is then its upper bound will be the maximal element we need. Zorn’s lemma formalizes this intuition, and without it, lemma we can’t always pin down a specific chain which works.

Note how Zorn’s Lemma allows us to make infinitely many arbitrary choices as far as selecting elements of the infinite basis for the vector space is concerned. In general, this is where Zorn’s lemma comes in useful.

The upper bounds are necessary. For example, in the natural numbers \(\mathbb{N}\), the entirety of the natural numbers forms a chain. This chain has no upper bound in \(\mathbb{N}\). Also, the natural numbers do not have a maximal element.

Note that it may also be possible for a maximal element to exist without there being an upper bound. Consider for example the natural numbers with an ‘extra’ element which is not comparable to any of the numbers. Then this is a perfectly acceptable poset. Since this extra element is incomparable, then in particular there is no

Getting Your Ducks in a Row, or, Rather, Getting Your Real Numbers in a Row: The Well-Ordering Principle

A linearly ordered set is called well-ordered if any of its non-empty subsets has a least element.

For example, the natural numbers \(\mathbb{N}\) are well-ordered. Consider any non-empty subset of the natural numbers (e.g., \(\{42, 48, 64, \ldots\}\). It has a least element (e.g., \(42\)).

The positive real numbers (and in fact the positive rational numbers) are not well-ordered. There is no least element, since for any number bigger than zero (e.g. 13) it is possible to find a smaller number (e.g. 14) which is also bigger than zero.

The well-ordering principle states that any set is bijective to some well-ordered one. This basically states that you can have a well-ordered set of any size.

To see why this is surprising, try imaging a different linear order on the reals such that any subset you may choose—Any subset—has a least element.

Again, the Axiom of Choice allows us to do this.

In fact if we are always able to well order sets, then we are able to use it to make choice functions: imagine you needed to choose an element from each set in a set of sets, then you can just choose the least element from each set.

AC On a Budget: Weaker Versions of the Axiom

There are also theorems which do not follow from ZF, and which do follow from AC, but are not strong enough to use to prove AC. What this is equivalent to saying is that there are models of set theory in which these theorems are true, but for which the axiom of choice does not hold in full generality.

A few examples of such theorems:

  • The Hausdorff paradox and Banach-Tarski paradox (mentioned above).

  • A union of countably many countable sets is countable.

  • The axiom of dependent choice (given a non-empty set \(X\) and (entire) binary relation \(R\), there exists a sequence \((x_n)_{n \in \mathbb{N}}\) such that \(x_n\) is \(R\)-related to \(x_{n+1}\)) .

  • The axiom of countable choice (every countable set of sets has a choice function).

  • Every field has an algebraic closure

  • Existence of non-principal ultrafilters.

  • Gödel’s completeness theorem for first-order logic.

  • Boolean Prime Ideal Theorem (useful for proving existence of non-principal ultrafilters and Gödel’s completenes theorem).

  • The Law of excluded middle for logic.

More examples may be found on Wikipedia page.

And In Related News: The Continuum Hypothesis

Intuitively, the Continuum Hypothesis (CH) states that there is no set strictly bigger than the set of all natural numbers, but strictly smaller than the set of all real numbers. (These are two infinite sets, but they are different infinities).

The formal statement concerns cardinality of sets. In particular, it states that there is no set which has cardinality strictly larger than the set of natural numbers, but strictly smaller than the set of real numbers.

It is called the ‘Continuum Hypothesis’ because it concerns the size of the continuum (the set of real numbers) and was hypothesized to be true by Georg Cantor in 1878.

It is again independent of the Zermelo-Fraenkel axioms, and this was proven in the same manner and at the same time as the proof of the independence of AC from ZF (described in more detail above).

In fact, the continuum hypothesis was shown to be independent even from ZFC, (ZF with the Axiom of Choice). However, the continuum hypothesis implies the Axiom of Choice under ZF. In other words, given the ZF axioms, if you know that the Axiom of Choice is true then you do not yet know anything about the truth of Continuum Hypothesis. However, if you know that the Continuum Hypothesis is true, then you know the Axiom of Choice must also be true.

The Generalized Continuum Hypthosis (GCH) is, well, a generalized version of the Continuum Hypothesis, stating that not only are there no sets of size lying strictly between the natural numbers \(\mathbb{N}\) and the reals \(\mathbb{R}\), but that for any set \(X\), there is no set of size lying strictly between the sizes of \(X\) and the power set \(P(X)\). In particular, note that the reals \(\mathbb{R}\) is of the same size as the power set \(P(\mathbb{N})\) of the naturals, so that GCH implies CH. It is also strictly stronger than CH (it is not implied by CH).

Axiom of Choice Considered Harmful: Constructive Mathematics and the Potential Pitfalls of AC

Why does the axiom of choice have such a bad reputation with constructive mathematicians?

It is important to realise that some of the reasons mathematicians had for doubting AC are no longer relevant. Because of some of its counter-intuitive results, mathematicians

To understand this view, it is necessary to understand more about the constructive view in general. %TODO

(Posting a list of links here until I have a better idea of what to write, in approx. reverse order of relevance /​ usefulness) See also

Choosing Not to Choose: Set-Theoretic Axioms Which Contradict Choice

It is also possible to assume axioms which contradict the axiom of choice. For example, there is the Axiom of Determinancy. This axiom states that for any two-person game of a certain type, one player has a winning strategy. %TODO

I Want to Play a Game: Counterintuitive Strategies Using AC

There are a few examples of very counter-intuitive solutions to (seemingly) impossible challanges which work only in the presence of the Axiom of Choice.

Examples may be found in the following places: