Axiom of Choice

“The Ax­iom of Choice is nec­es­sary to se­lect a set from an in­finite num­ber of pairs of socks, but not an in­finite num­ber of pairs of shoes.” — Ber­trand Rus­sell, In­tro­duc­tion to math­e­mat­i­cal philosophy

“Tarski told me the fol­low­ing story. He tried to pub­lish his the­o­rem [the equiv­alence be­tween the Ax­iom of Choice and the state­ment ‘ev­ery in­finite set A has the same car­di­nal­ity as AxA’] in the Comptes Ren­dus Acad. Sci. Paris but Fréchet and Lebesgue re­fused to pre­sent it. Fréchet wrote that an im­pli­ca­tion be­tween two well known propo­si­tions is not a new re­sult. Lebesgue wrote that an im­pli­ca­tion be­tween two false propo­si­tions is of no in­ter­est. And Tarski said that af­ter this mis­ad­ven­ture he never again tried to pub­lish in the Comptes Ren­dus.”

  • Jan My­ciel­ski, A Sys­tem of Ax­ioms of Set The­ory for the Rationalists

Obli­ga­tory Introduction

The Ax­iom of Choice, the most con­tro­ver­sial ax­iom of the 20th Cen­tury.

The ax­iom states that a cer­tain kind of func­tion, called a `choice’ func­tion, always ex­ists. It is called a choice func­tion, be­cause, given a col­lec­tion of non-empty sets, the func­tion ‘chooses’ a sin­gle el­e­ment from each of the sets. It is a pow­er­ful and use­ful ax­iom, as­sert­ing the ex­is­tence of use­ful math­e­mat­i­cal struc­tures (such as bases for vec­tor spaces of ar­bi­trary di­men­sion, and ul­tra­prod­ucts). It is a gen­er­ally ac­cepted ax­iom, and is in wide use by math­e­mat­i­ci­ans. In fact, ac­cord­ing to Elliott Men­del­son in In­tro­duc­tion to Math­e­mat­i­cal Logic (1964) “The sta­tus of the Ax­iom of Choice has be­come less con­tro­ver­sial in re­cent years. To most math­e­mat­i­ci­ans it seems quite plau­si­ble and it has so many im­por­tant ap­pli­ca­tions in prac­ti­cally all branches of math­e­mat­ics that not to ac­cept it would seem to be a wilful hob­bling of the prac­tic­ing math­e­mat­i­cian. ”

Nev­er­less, be­ing an ax­iom, it can­not be proven and must in­stead be as­sumed. In par­tic­u­lar, it is an ax­iom of set the­ory and it is not prov­able from the other ax­ioms (the Zer­melo-Fraenkel ax­ioms of Set The­ory). In fact many math­e­mat­i­ci­ans, in par­tic­u­lar con­struc­tive math­e­mat­i­ci­ans, re­ject the ax­iom, stat­ing that it does not cap­ture a ‘real’ or ‘phys­i­cal’ prop­erty, but is in­stead just a math­e­mat­i­cal odd­ity, an arte­fact of the math­e­mat­ics used to ap­prox­i­mate re­al­ity, rather than re­al­ity it­self. In the words of the LessWrong com­mu­nity: the con­struc­tive math­e­mat­i­ci­ans would claim it is a state­ment about the map, and not the ter­ri­tory.

His­tor­i­cally, the ax­iom has ex­pe­rienced much con­tro­versy. Be­fore it was shown to be in­de­pen­dent of the other ax­ioms, it was be­lieved ei­ther to fol­low from them (i.e., be ‘True’) or lead to a con­tra­dic­tion (i.e., be ‘False’). Its in­de­pen­dence from the other ax­ioms was, in fact, a very sur­pris­ing re­sult at the time.

Get­ting the Heavy Maths out the Way: Definitions

In­tu­itively, the ax­iom of choice states that, given a col­lec­tion of non-empty sets, there is a func­tion which se­lects a sin­gle el­e­ment from each of the sets.

More for­mally, given a set \(X\) whose el­e­ments are only non-empty sets, there is a func­tion

$$ f: X \rightarrow \bigcup_{Y \in X} Y $$
from \(X\) to the union of all the el­e­ments of \(X\) such that, for each \(Y \in X\), the image of \(Y\) un­der \(f\) is an el­e­ment of \(Y\), i.e., \(f(Y) \in Y\).

In log­i­cal no­ta­tion,

$$ \forall_X \left( \left[\forall_{Y \in X} Y \not= \emptyset \right] \Rightarrow \left[\exists \left( f: X \rightarrow \bigcup_{Y \in X} Y \right) \left(\forall_{Y \in X} \exists_{y \in Y} f(Y) = y \right) \right] \right) $$

Ax­iom Un­nec­es­sary for Finite Col­lec­tions of Sets

For a finite set \(X\) con­tain­ing only finite non-empty sets, the ax­iom is ac­tu­ally prov­able (from the Zer­melo-Fraenkel ax­ioms of set the­ory ZF), and hence does not need to be given as an ax­iom. In fact, even for a finite col­lec­tion of pos­si­bly in­finite non-empty sets, the ax­iom of choice is prov­able (from ZF), us­ing the ax­iom of in­duc­tion. In this case, the func­tion can be ex­plic­itly de­scribed. For ex­am­ple, if the set \(X\) con­tains only three, po­ten­tially in­finite, non-empty sets \(Y_1, Y_2, Y_3\), then the fact that they are non-empty means they each con­tain at least one el­e­ment, say \(y_1 \in Y_1, y_2 \in Y_2, y_3 \in Y_3\). Then define \(f\) by \(f(Y_1) = y_1\), \(f(Y_2) = y_2\) and \(f(Y_3) = y_3\). This con­struc­tion is per­mit­ted by the ax­ioms ZF.

The prob­lem comes in if \(X\) con­tains an in­finite num­ber of non-empty sets. Let’s as­sume \(X\) con­tains a countable num­ber of sets \(Y_1, Y_2, Y_3, \ldots\). Then, again in­tu­itively speak­ing, we can ex­plic­itly de­scribe how \(f\) might act on finitely many of the \(Y\)s (say the first \(n\) for any nat­u­ral num­ber \(n\)), but we can­not de­scribe it on all of them at once.

To un­der­stand this prop­erly, one must un­der­stand what it means to be able to ‘de­scribe’ or ‘con­struct’ a func­tion \(f\). This is de­scribed in more de­tail in the sec­tions which fol­low. But first, a bit of back­ground on why the ax­iom of choice is in­ter­est­ing to math­e­mat­i­ci­ans.

Con­tro­versy: Math­e­mat­i­ci­ans Di­vided! Counter-In­tu­itive Re­sults, and The His­tory of the Ax­iom of Choice

Math­e­mat­i­ci­ans have been us­ing an in­tu­itive con­cept of a set for prob­a­bly as long as math­e­mat­ics has been prac­ticed. At first, math­e­mat­i­ci­ans as­sumed that the ax­iom of choice was sim­ply true (as in­deed it is for finite col­lec­tions of sets).

Ge­org Can­tor in­tro­duced the con­cept of trans­finite num­bers and differ­ent car­di­nal­ities of in­finity in a 1874 pa­per (which con­tains his in­fa­mous Di­ag­o­nal­iza­tion Ar­gu­ment) and along with this sparked the in­tro­duc­tion of set the­ory. In 1883, Can­tor in­tro­duced a prin­ci­ple called the ‘Well-Order­ing Princ­ple’ (dis­cussed fur­ther in a sec­tion be­low) which he called a ‘law of thought’ (i.e., in­tu­itively true). He at­tempted to prove this prin­ci­ple from his other prin­ci­ples, but found that he was un­able to do so.

Ernst Zer­melo at­tempted to de­velop an ax­io­matic treat­ment of set the­ory. He man­aged to prove the Well-Order­ing Prin­ci­ple in 1904 by in­tro­duc­ing a new prin­ci­ple: The Prin­ci­ple of Choice. This sparked much dis­cus­sion amongst math­e­mat­i­ci­ans. In 1908 pub­lished a pa­per con­tain­ing re­sponses to this de­bate, as well as a new for­mu­la­tion of the Ax­iom of Choice. In this year, he also pub­lished his first ver­sion of the set the­o­retic ax­ioms, known as the Zer­melo Ax­ioms of Set The­ory. Math­e­mat­i­ci­ans, Abra­ham Fraenkel and Tho­ralf Skolem im­proved this sys­tem (in­de­pen­dently of each other) into its mod­ern ver­sion, the Zer­melo Fraenkel Ax­ioms of Set The­ory.

In 1914, Felix Haus­dorff proved Haus­dorff’s para­dox. The ideas be­hind this proof were used in 1924 by Banach and Alfred Tarski to prove the more fa­mous Banach-Tarski para­dox (dis­cussed in more de­tail be­low). This lat­ter the­o­rem is of­ten quoted as ev­i­dence of the false­hood of the ax­iom of choice.

Between 1935 and 1938, Kurt Gödel proved that the Ax­iom of Choice is con­sis­tent with the rest of the ZF ax­ioms.

Fi­nally, in 1963, Paul Co­hen de­vel­oped a rev­olu­tion­ary math­e­mat­i­cal tech­nique called forc­ing, with which he proved that the ax­iom of choice could not be proven from the ZF ax­ioms (in par­tic­u­lar, that the nega­tion of AC is con­sis­tent with ZF). For this, and his proof of the con­sis­tency of the nega­tion of the Gen­er­al­ized Con­tinuum Hy­poth­e­sis from ZF, he was awarded a fields medal in 1966.

This ax­iom came to be ac­cepted in the gen­eral math­e­mat­i­cal com­mu­nity, but was re­jected by the con­struc­tive math­e­mat­i­ci­ans as be­ing fun­da­men­tally non-con­struc­tive. How­ever, it should be noted that in many forms of con­struc­tive math­e­mat­ics, there are prov­able ver­sions of the ax­iom of choice. The differ­ence is that in gen­eral in con­struc­tive math­e­mat­ics, ex­hibit­ing a set of non-empty sets (tech­ni­cally, in con­struc­tive set-the­ory, these should be ‘in­hab­ited’ sets) also amounts to ex­hibit­ing a proof that they are all non-empty, which amounts to ex­hibit­ing an el­e­ment for all of them, which amounts to ex­hibit­ing a func­tion choos­ing an el­e­ment in each. So in con­struc­tive math­e­mat­ics, to even state that you have a set of in­hab­ited sets re­quires stat­ing that you have a choice func­tion to these sets prov­ing they are all in­hab­ited.

Some ex­pla­na­tion of the his­tory of the ax­iom of choice (as well as some of its is­sues) can be found in the pa­per “100 years of Zer­melo’s ax­iom of choice: what was the prob­lem with it?” by the con­struc­tive math­e­mat­i­cian Per Martin-Löf at this web­page.

(Martin-Löf stud­ied un­der An­drey Kol­mogorov of Kol­mogorov com­plex­ity and has made con­tri­bu­tions to in­for­ma­tion the­ory, math­e­mat­i­cals­tatis­tics, and math­e­mat­i­cal<em>logic, in­clud­ing de­vel­op­ing a form of in­tu­ition­is­tic type the­ory).

A nice timeline is also sum­marised on Stan­ford En­cy­clopae­dia of Logic.

So, What is this Choice Thing Good for Any­ways?

The Ax­iom of Choice is in com­mon use by math­e­mat­i­ci­ans in prac­tice. Amongst its many ap­pli­ca­tions are the fol­low­ing:

Non-Empty Products

This is the stat­ment that tak­ing the math­e­mat­i­cal product of non-empty sets will always yield a non-empty set. Con­sider in­finitely many sets \(X_1, X_2, X_3, \ldots\) in­dexed by the nat­u­ral num­bers. Then an el­e­ment of the product \(\prod_{i \in \mathbb{N}} X_i\) is a fam­ily (es­sen­tially an in­finite tu­ple) of the form \((x_1, x_2, x_3, \ldots )\) where \(x_1 \in X_1\), \(x_2 \in X_2\) and so on. How­ever, to se­lect such a tu­ple in the product amounts to se­lect­ing a sin­gle el­e­ment from each of the sets. Hence, with­out the ax­iom of choice, it is not prov­able that there are any el­e­ments in this product.

Note, that with­out the ax­iom, it is not nec­es­sar­ily the case that the product is empty. It just isn’t prov­able that there are any el­e­ments. How­ever, it is con­sis­tent that there ex­ists some product that is empty. Also, bear in mind that even with­out the ax­iom of choice, there are prod­ucts of this form which are non-empty. It just can’t be shown that they are non-empty in gen­eral.

How­ever, it does seem some­what coun­ter­in­tu­itive that such a product be non-empty. In gen­eral, given two non-empty sets, \(X_1\) and \(X_2\), the product is at least as large as ei­ther of the sets. Ad­ding an­other non-empty set \(X_3\) usu­ally makes the product larger still. It may seem strange, then, that tak­ing the limit of this pro­ce­dure could re­sult in an empty struc­ture.

Ex­is­tence of a ba­sis for any (esp. in­finite di­men­sional) vec­tor space

This ex­am­ple is dis­cussed in more de­tail in the next sec­tion.

A vec­tor space can be in­tu­itively thought of as a col­lec­tion of vec­tors which them­selves can be thought of as ar­rows (or the in­for­ma­tion of a ‘di­rec­tion’ and a ‘mag­ni­tude’).

A vec­tor space has a num­ber of ‘di­rec­tions’ in which the vec­tors can point, referred to as its vec­tor space di­men­sion. For a finite-di­men­sional vec­tor space, it is pos­si­ble to find a ba­sis con­sist­ing of vec­tors. The prop­erty of a ba­sis is that:

  • any vec­tor in the space can be built up as a com­bi­na­tion of the vec­tors in the ba­sis

  • none of the vec­tors in the ba­sis can be built up as a com­bi­na­tion of the rest of the vec­tors in the ba­sis.

For finite-di­men­sional vec­tor spaces, such a ba­sis is finite and can be found. How­ever, for in­finite-di­men­sional vec­tor spaces, a way of find­ing a ba­sis does not always ex­ist. In fact, if the ax­iom of choice is false, then there is an in­finite-di­men­sional vec­tor space for which it is im­pos­si­ble to find a ba­sis.

Brouwer’s Fixed-Point The­o­rem: the ex­is­tence of a fixed point for a func­tion from a “nice” shape to itself

Any con­tin­u­ous func­tions \(f: C \rightarrow C\) from a closed disk \(C\) onto it­self has a fixed point \(x_0\). (In full gen­er­al­ity, \(C\) may be any con­vex com­pact set).

In other words, there is at least one point \(x_0 \in C\) such that \(f(x_0) = x_0\).

This works also for rec­t­an­gles, and even in mul­ti­ple di­men­sions. Hence, if true for real world ob­jects, this the­o­rem has the fol­low­ing some­what sur­pris­ing con­se­quences:

  • Take two iden­ti­cal sheets of graph with marked co­or­di­nates. Crum­ple up sheet B and place the crum­pled ball on top of sheet A. Then, there is some co­or­di­nate \((x , y)\) (where \(x\) and \(y\) are real num­bers, not nec­es­sar­ily in­te­gers), such that this co­or­di­nate on sheet B is di­rectly above that co­or­di­nate on sheet A.

  • Take a cup of tea. Stir it and let it set­tle. There is some point of the tea which ends up in the same place it started.

  • Take a map of France. Place it on the ground in France. Take a pin. There is a point on the map through which you could stick the pin and the pin will also stick into the ground at the point rep­re­sented on the map.

Ex­is­tence of ul­tra­filters and hence ultraproducts

The fol­low­ing ex­am­ple is some­what tech­ni­cal. An at­tempt is made to de­scribe it very roughly.

Given an in­dex­ing set \(I\), and a col­lec­tion of math­e­mat­i­cal struc­tures \((A_i)_{i \in I}\) (of a cer­tain type) in­dexed by \(I\). (For ex­am­ple, let \(I\) be the nat­u­ral num­bers \(\mathbb{N}\) and let each of the math­e­mat­i­cal struc­tures \(A_n\) be num­bered).

An ul­tra­filter \(\mathcal{U}\) on \(I\) is a col­lec­tion of sub­sets of \(I\) of a spe­cial type. In­tu­itively it should be thought of as a col­lec­tion of `big sub­sets’ of \(I\). It is pos­si­ble to form the set of all cofinite sub­sets of \(I\) with­out the ax­iom of choice, and \(\mathcal{U}\) should con­tain at least these. How­ever, for math­e­mat­i­cal rea­sons, \(\mathcal{U}\) should also con­tain ‘as many sets as pos­si­ble’. How­ever, in or­der to do so, there are some ‘ar­bi­trary choices’ that have to be made. This is where the ax­iom of choice comes in.

One of the ap­pli­ca­tions of ul­tra­filters is ul­tra­prod­ucts. For each sub­set \(X \subseteq I\) such that \(X \in \mathcal{U}\) there is a sub­col­lec­tion \((A_i)_{i \in X}\) of \((A_i)_{i \in I}\). Call such a sub­col­lec­tion a “large col­lec­tion”. The ul­tra­product \(A\) is a struc­ture that cap­tures the prop­er­ties of large col­lec­tions of the \(A_i\)s, in the sense that a state­ment of first or­der logic is true of the ul­tra­product \(A\) if and only if it is true of some large col­lec­tion of the \(A_i\)s.

Now, any state­ment that is ei­ther true for cofinitely many \(A_i\)s or false for cofinitely many \(A_i\)s will be true or false re­spec­tively for \(A\). But what about the other state­ments? This is where the ar­bi­trary choices come in. Each state­ment needs to be ei­ther true or false of \(A\), and we use the ax­iom of choice to form an ul­tra­filter that does that for us.

One ba­sic ex­am­ple of an ap­pli­ca­tion of ul­tra­filters is form­ing the non­stan­dard real num­bers.

Fur­ther ex­am­ples of ap­pli­ca­tions of the ax­iom of choice may be found on the Wikipe­dia page here and here.

Physi­cists Hate Them! Find out How Banach and Tarski Make In­finity Dol­lars with this One Sim­ple Trick!

One of the most counter-in­tu­itive con­se­quences of the Ax­iom of Choice is the Banach-Tarski Para­dox. It is a the­o­rem prov­able us­ing the Zer­melo-Fraenkel ax­ioms along with the ax­iom of choice.

This the­o­rem was proven in a 1924 pa­per by Ste­fan Banach and Alfred Tarski.

In­tu­itively, what the the­o­rem says is that it is pos­si­ble to take a ball, cut it into five pieces, ro­tate and shift these pieces and end up with two balls.

Now, there are some com­pli­ca­tions, in­clud­ing the fact the pieces them­selves are in­finitely com­plex, and the have to pass through each other when they are be­ing shifted. There is no way a prac­ti­cal im­ple­men­ta­tion of this the­o­rem could be de­vel­oped. Nev­er­the­less, that the vol­ume of a ball could be changed just by cut­ting, ro­tat­ing and shift­ing seems highly counter-in­tu­itive.

A supris­ingly good video ex­pla­na­tion in lay­mans terms by Vsauce can be found Here.

The video In­finity shapeshifter vs. Banach-Tarski para­dox by Matholo­gor ad­ver­tises it­self as a pre­quel to the above video, which puts you ‘in the mind­set’ of a math­e­mat­i­cian, so-to-speak, and makes the re­sult a bit less sur­pris­ing.

This the­o­rem is the main coun­terex­am­ple used as ev­i­dence of the false­hood of the Ax­iom of Choice. If not taken as ev­i­dence of its false­hood, thsi is at least used as ev­i­dence of the counter-in­tu­itive­ness of AC.

Q: What’s an Ana­gram of Banach-Tarski? A: Banach-Tarski Banach Tarski

How Some­thing Can Ex­ist Without Ac­tu­ally Ex­ist­ing: The Zer­melo Fraenkel Ax­ioms and the Ex­is­tence of a Choice Function

The Zer­melo-Fraenkel ax­ioms of Set The­ory (ZF) in­tro­duce a fun­da­men­tal con­cept, called a set, and a re­la­tion be­tween sets, called the el­e­ment of re­la­tion (us­ally writ­ten \(\in\)) where \(x \in X\) should be in­ter­preted as ‘$x$ is con­tained in­side \(X\) in the way that an item is con­tained in­side a box’. There are then a num­ber of ax­ioms im­posed on these fun­da­men­tal ob­jects and this re­la­tion.

What one must re­mem­ber is that the­o­rems de­rived from these ax­ioms are merely state­ments of the form that ‘if one has a sys­tem which satis­fies these laws (even if, for ex­am­ple \(\in\) is in­ter­preted as some­thing en­tirely differ­ent from be­ing con­tained in­side some­thing), then it must also satisfy the state­ments of the the­o­rems’. How­ever, its gen­eral use is to imag­ine sets as be­ing some­thing like boxes which con­tain math­e­mat­i­cal ob­jects. More­over, al­most any state­ment of math­e­mat­ics can be stated in terms of sets, where the math­e­mat­i­cal ob­jects in ques­tion be­come sets of a cer­tain kind. In this way, since the math­e­mat­i­cal ob­jects in ques­tion are set up to satisfy the ax­ioms, then any­thing which can be de­rived from these ax­ioms will also hold for the math­e­mat­i­cal ob­jects.

In par­tic­u­lar, a func­tion can be in­ter­preted as a spe­cific kind of set. In par­tic­u­lar, it is a set of or­dered pairs (more gen­er­ally, or­dered n-tu­ples, each of which can it­self be in­ter­preted as a spe­cific kind of set) satis­fy­ing a spe­cific prop­erty.

There are differ­ent ways of stat­ing the same ax­ioms (by sep­a­rat­ing or com­bin­ing ax­ioms, giv­ing differ­ent for­mu­la­tions of the same ax­ioms, or giv­ing differ­ent ax­ioms that are equiv­a­lent given the other ax­ioms) hence what fol­lows is only a spe­cific for­mu­la­tion, namely, the Zer­melo-Fraenkel one from Wikipe­dia.

The ax­ioms be­gin by stat­ing that two sets are the same if they have the same el­e­ments. Then the ax­iom reg­u­lar­ity states sets are well-be­haved in a cer­tain way that’s not so im­por­tant to us right now.

Now comes the part that is im­por­tant for our pur­poses: The ax­iom schema of speci­fi­ca­tion (ac­tu­ally a schema spec­i­fy­ing in­finitely many ax­ioms, but we can pre­tend it is just one ax­iom for now). This is an ax­iom as­sert­ing the ex­is­tence of cer­tain sets. In a sense, it al­lows one to ‘cre­ate’ a new set out of an ex­ist­ing one. Namely, given a set \(X\) and a state­ment \(\phi\) of first or­der logic (a state­ment about sets of a spe­cific, very for­mal form, and which uses only the \(\in\) sym­bol and the re­served sym­bols of logic), it is pos­si­ble to cre­ate a set \(\{x \in X : \phi(x) \}\) of all of the el­e­ments of \(X\) for which the for­mula \(\phi\) is true.

For ex­am­ple, if we know (or as­sume) the set of all num­bers \(\mathbb{N}\) ex­ists, and we have some way of for­mal­is­ing the state­ment ‘$x$ is an even num­ber’ as a first-or­der state­ment \(\phi(x)\), then the set of all even num­bers ex­ists.

Ad­di­tion­ally the ax­ioms of pairing and union, ax­iom schema of re­place­ment, and ax­iom of power set are all of the form “Given that some sets \(A, B, C, \ldots\) ex­ist, then some other set \(X\) also ex­ists. The ax­iom of in­finity sim­ply states that an in­finite set with cer­tain prop­er­ties ex­ists.

No­tice that all of the above are ax­ioms. It is not ex­pected that any of them be proven. They are sim­ply as­sump­tions that you make about whichever sys­tem you want to rea­son. Any the­o­rems that you can prove from these ax­ioms will then be true about your sys­tem. How­ever, math­e­mat­i­ci­ans gen­er­ally agree that these ax­ioms cap­ture our in­tu­itive no­tion about how “sets” of ob­jects (or even con­cepts) should be­have, and about which sets we are al­lowed to rea­son (which sets ‘ex­ist’). Most of these (ex­cept maybe the ax­iom of in­finity, and even that one pos­si­bly) seem to ap­ply to our world and seem to work fine.

Now, the last ax­iom, the ax­iom of choice (or the well-or­der­ing prin­ci­ple) as­serts that a cer­tain kind of func­tion ex­ists. It can­not be proven from the above. In other words, given a sys­tem that satis­fies all of the above, it can­not be as­sumed that the sys­tem also satis­fies this ax­iom (nor in fact that it does not satisfy this ax­iom). That’s all there is to it, re­ally.

Yet, math­e­mat­i­ci­ans do dis­agree about this ax­iom, and whether it ap­plies to our world as well. Some math­e­mat­i­ci­ans take a Pla­tonic view of math­e­mat­ics, in which math­e­mat­i­cal ob­jects such as sets ac­tu­ally ex­ist in some ab­stract realm, and for which the ax­iom of choice is ei­ther true or false, but we do not know which. Others take a highly con­struc­tive view (in many cases mo­ti­vated by re­al­ism and the abil­ity of the math­e­mat­ics to model the world) in which ei­ther the ax­iom of choice is false, or in­finite sets do not ex­ist in which case the ax­iom of choice is prov­able and hence su­perflu­ous. Others take the view that the ax­iom is not true or false, but merely use­less, and that any­thing prov­able from it is mean­ingless. Many seem not to care: the ax­iom is con­ve­nient for the math­e­mat­ics they wish to do (whose ap­pli­ca­tion they are not much con­cerned about in any case) and hence they as­sume it with­out qualm.

How Some­thing can be Nei­ther True nor False:

How Could we Pos­si­bly Know that AC is In­de­pen­dent of ZF?# It has been stated above mul­ti­ple times that the Ax­iom of Choice is in­de­pen­dent from the other Zer­melo-Fraenkel ax­ioms. In other words, it can nei­ther be proven nor dis­proven from those ax­ioms. But how can we pos­si­bly know this fact?

The an­swer lies in mod­els. How­ever, these are not phys­i­cal or even com­pu­ta­tional mod­els, but mod­els of a more ab­stract kind.

For ex­am­ple, a model of group the­ory is a spe­cific group, which it­self can be char­ac­ter­ized as a spe­cific set. Now, no­tice that the ax­ioms of group the­ory say noth­ing about whether a given group is abelian (com­mu­ta­tive) or not. It does not fol­low from the ax­ioms that for groups it is always true that \(xy = yx\), nor does it fol­low that for groups there are always some \(x\) and \(y\) for which \(xy \not= yx\). In other words, the “abelian ax­iom” is in­de­pen­dent of the ax­ioms of group the­ory. How do we know this fact? We need sim­ply ex­hibit two mod­els, two groups, one of which is abelian and the other not. For these groups, I pick, oh say the cyclic group on 3 el­e­ments and the sym­met­ric group \(S_3\). The first is abelian, the sec­ond, not.

In or­der to rea­son about such mod­els of set the­ory, one as­sumes the ex­is­tence of “meta-sets” in some meta-the­ory. The en­tire “uni­verse” of set the­ory is then a cer­tain “meta-set” be­hav­ing in a cer­tain way. In case this feels too much like cheat­ing, this Stack­Ex­change an­swer should help clear things up. In par­tic­u­lar, the fol­low­ing quote from VanLiere’s the­sis:

″ Since these ques­tions all have to do with first-or­der prov­abil­ity, we could take as our metathe­ory some very weak the­ory (such as Peano ar­ith­metic) which is suffi­cient for for­mal­iz­ing first-or­der logic. How­ever, as is cus­tom­ary in trea­tises about set the­ory, we take as our metathe­ory ZF plus the Ax­iom of choice in or­der to have at our dis­posal the in­fini­tary tools of model the­ory. We will also use lo­cu­tions such as … which are only re­ally jus­tifi­able in some even stronger metathe­ory with the un­der­stand­ing that they could be elimi­nated through the use of Boolean-val­ued mod­els or some other de­vice. ”

In other words, it is pos­si­ble to form these mod­els in some weaker the­ory on which we have “more of a grip”. The en­tirity of set the­ory are then spe­cial ob­jects satis­fy­ing the ax­ioms of this weaker the­ory. Is it pos­si­ble to re­peat this pro­cess ad in­fini­tum? No. But if we want, we could even deal with just a finite frag­ment of set the­ory: We as­sume that any math­e­mat­ics we want to do only needs a finite num­ber of the (in­finitely many) ax­ioms of set the­ory. We then prove what we want about this finite frag­ment. But we may as well have proved it about the whole the­ory.

Now, pick (or con­struct) two spe­cific ob­jects of the meta-the­ory such that in one of them, the ax­iom of choice is true, and that in the other, the ax­iom of choice is false.

To ob­tain these two mod­els re­quires vastly differ­ent ap­proaches which will not be de­scribed in de­tail here. More de­tail can be found on­line in Kunen’s text. The con­sis­tency of choice is the eas­ier di­rec­tion, and in­volves con­struct­ing some­thing called Gödel’s con­structible uni­verse of sets (or just Gödel’s uni­verse or the con­structible uni­verse). The con­sis­tency of the nega­tion of choice is more difficult, and in­volves a tech­nique de­vel­oped by Paul Co­hen called forc­ing.

A Rose by Any Other Name: Alter­na­tive Char­ac­ter­i­za­tions of AC

There are a few similar ways to state the ax­iom of choice. For ex­am­ple:

Given any set \(C\) con­tain­ing non-empty sets which are pair­wise dis­joint (any two sets in \(C\) do not in­ter­sect), then there is some set \(X\) that in­ter­sects each of the sets in \(C\) in ex­actly one el­e­ment.

There are also many al­ter­na­tive the­o­rems which at first glance ap­pear to be very differ­ent from the ax­iom of choice, but which are ac­tu­ally equiv­a­lent to it, in the sense that each of the the­o­rems can both be proven from the ax­iom of choice and be used to prove it (in con­junc­tion with the other ZF ax­ioms).

A few ex­am­ples:

  • Zorn’s Lemma (De­scribed in more de­tail be­low).

  • Well-Order­ing The­o­rem (Also de­scribed in more de­tail be­low).

  • The product of non-empty sets is non-empty.

  • Tarski’s The­o­rem for Bi­nary Prod­ucts (from the quote at the start of this ar­ti­cle) that \(A\) is bi­jec­tive to \(A \times A\).

  • Every sur­jec­tive func­tion has a right in­verse.

  • Given two sets, ei­ther

  • Every vec­tor space has a basis

  • Every con­nected graph has a span­ning tree.

  • For any pair of sets have com­pa­rable­car­di­nal­ity: for any pair of sets \(X\) and \(Y\), ei­ther they are the same size, or one is smaller than the other.

More ex­am­ples can be found on the Wikipe­dia page.

Be­cause of the in­tu­itive na­ture of some of these state­ments (es­pe­cially that prod­ucts are non-empty, that vec­tor spaces have bases and that car­di­nal­ities are com­pa­rable), they are of­ten used as ev­i­dence for the truth, or mo­ti­va­tion for the use, of the Ax­iom of Choice.

Zorn’s Lemma? I hardly Know her!

The fol­low­ing is a spe­cific ex­am­ple of a very com­mon way in which the ax­iom of choice is used, called Zorn’s Lemma. It is a state­ment equiv­a­lent to the ax­iom of choice, but eas­ier to use in many math­e­mat­i­cal ap­pli­ca­tions.

The state­ment is as fol­lows:

Every par­tially or­dered set (poset) for which ev­ery chain has an up­per bound has a max­i­mal el­e­ment.

In other words, if you have an or­dered set \(X\) and you con­sider any lin­early or­dered sub­set \(C\), and there is some el­e­ment \(u \in X\) which is at least as large as any el­e­ment in \(C\), i.e., \(u \geq c\) for any \(c \in C\), then there is an el­e­ment \(m \in X\) which is max­i­mal, in the sense that it is at least as big as any com­pa­rable el­e­ment of \(X\), i.e., for any \(x \in X\), it holds that \(m \not< x\). (It is max­i­mal, but not nec­es­sar­ily a global max­i­mum. There is noth­ing in \(X\) ly­ing above \(m\), but there may be in­com­pa­rable el­e­ments).

Again, this is prov­able for a finite poset X, and for some in­finite posets X, but not prov­able in gen­eral.

Now, why is this rather ar­cane state­ment use­ful?

Well, of­ten for some type of math­e­mat­i­cal struc­ture we are in­ter­ested in, all the struc­tures of that type form a poset un­der in­clu­sion, and if a max­i­mal such struc­ture ex­isted, it would have a par­tic­u­larly nice prop­erty. Fur­ther­more, for many struc­tures, a union of all the struc­tures in a up­wards-in­creas­ing chain of such struc­tures (un­der in­clu­sion) is it­self a struc­ture of the right type, as well as an up­per-bound for the chain. Then Zorn’s lemma gives us the max­i­mal struc­ture we are look­ing for.

As an ex­am­ple, con­sider a (esp. in­finite-di­men­sional) vec­tor space \(V\), and try­ing to find a ba­sis for \(V\).

Choose any el­e­ment \(v_1 \in V\). Now con­sider all pos­si­ble lin­early in­de­pen­dent sets con­tain­ing \(v\). Th­ese form a poset (which con­tains at least the set \(\{v_1\}\) since that is lin­early in­de­pen­dent). Now con­sider any chain, pos­si­bly in­finite, of such sets. It looks like \(\{v_1\} \subseteq \{v, v_2\} \subseteq \{v_1, v_2, v_3 \} \subseteq \cdots\). Then take the union of all the sets in the chain \( \{v_1\} \cup \{v_1, v_2\} \cup \{v_1, v_2, v_3 \} \cdots = \{v_1, v_2, v_3, \ldots \}\). Call it \(B\). Then \(B\) con­tains all the el­e­ments in any of the sets in the chain.

It can be shown to be lin­early in­de­pen­dent, since if some el­e­ment \(v_i\) could be formed as a lin­ear com­bi­na­tion of finitely many other el­e­ments in \(B\), then this could be done already in one of the sets in the chain.

Then ev­ery chain of such lin­early in­d­pen­dent sets has an up­per bound, so the hy­poth­e­sis of Zorn’s Lemma holds. Then by Zorn’s Lemma, there is a max­i­mal el­e­ment \(M\). By defi­ni­tion, this max­i­mal el­e­ment has no su­per­set that is it­self lin­early in­de­pen­dent. This set of vec­tors also spans \(V\) (ev­ery el­e­ment of \(V\) can be writ­ten as a lin­ear com­bi­na­tion of vec­tors in \(M\)), since if it did not, then there would be a vec­tor \(v \in V\) which is lin­early in­de­pen­dent of \(M\) (can­not be writ­ten as a lin­ear com­bi­na­tion of vec­tors in \(M\)) and then the set \(M \cup \{v\}\), which is \(M\) ad­joined with \(v\), strictly con­tains \(M\), con­tra­dict­ing the max­i­mal­ity of \(M\).

Since the defi­ni­tion of a ba­sis is a max­i­mal lin­early in­de­pen­dent set span­ning \(V\), the proof is done. QED.

One might won­der why Zorn’s lemma is even nec­es­sary. Why could we not just have picked the union of the chain as our ba­sis? In a sense, we could have, pro­vided we use the cor­rect chain. For ex­am­ple, the chain of two el­e­ments \(\{v_1\}\) and \(\{v_1, v_2\}\) is not suffi­cient. We need an in­finite chain (and, in fact, a large enough in­finite chain at that) .
But there is a differ­ence be­tween be­ing able to prove that any chain has an up­per bound, and be­ing able to ac­tu­ally choose a spe­cific chain that works. In some sense, with­out Zorns lemma, we can rea­son in a very gen­eral vague way that, yes, the chains all have up­per bounds, and there might be a long enough chain, and if there is then its up­per bound will be the max­i­mal el­e­ment we need. Zorn’s lemma for­mal­izes this in­tu­ition, and with­out it, lemma we can’t always pin down a spe­cific chain which works.

Note how Zorn’s Lemma al­lows us to make in­finitely many ar­bi­trary choices as far as se­lect­ing el­e­ments of the in­finite ba­sis for the vec­tor space is con­cerned. In gen­eral, this is where Zorn’s lemma comes in use­ful.

The up­per bounds are nec­es­sary. For ex­am­ple, in the nat­u­ral num­bers \(\mathbb{N}\), the en­tirety of the nat­u­ral num­bers forms a chain. This chain has no up­per bound in \(\mathbb{N}\). Also, the nat­u­ral num­bers do not have a max­i­mal el­e­ment.

Note that it may also be pos­si­ble for a max­i­mal el­e­ment to ex­ist with­out there be­ing an up­per bound. Con­sider for ex­am­ple the nat­u­ral num­bers with an ‘ex­tra’ el­e­ment which is not com­pa­rable to any of the num­bers. Then this is a perfectly ac­cept­able poset. Since this ex­tra el­e­ment is in­com­pa­rable, then in par­tic­u­lar there is no

Get­ting Your Ducks in a Row, or, Rather, Get­ting Your Real Num­bers in a Row: The Well-Order­ing Principle

A lin­early or­dered set is called well-or­dered if any of its non-empty sub­sets has a least el­e­ment.

For ex­am­ple, the nat­u­ral num­bers \(\mathbb{N}\) are well-or­dered. Con­sider any non-empty sub­set of the nat­u­ral num­bers (e.g., \(\{42, 48, 64, \ldots\}\). It has a least el­e­ment (e.g., \(42\)).

The pos­i­tive real num­bers (and in fact the pos­i­tive ra­tio­nal num­bers) are not well-or­dered. There is no least el­e­ment, since for any num­ber big­ger than zero (e.g. 13) it is pos­si­ble to find a smaller num­ber (e.g. 14) which is also big­ger than zero.

The well-or­der­ing prin­ci­ple states that any set is bi­jec­tive to some well-or­dered one. This ba­si­cally states that you can have a well-or­dered set of any size.

To see why this is sur­pris­ing, try imag­ing a differ­ent lin­ear or­der on the re­als such that any sub­set you may choose—Any sub­set—has a least el­e­ment.

Again, the Ax­iom of Choice al­lows us to do this.

In fact if we are always able to well or­der sets, then we are able to use it to make choice func­tions: imag­ine you needed to choose an el­e­ment from each set in a set of sets, then you can just choose the least el­e­ment from each set.

AC On a Bud­get: Weaker Ver­sions of the Axiom

There are also the­o­rems which do not fol­low from ZF, and which do fol­low from AC, but are not strong enough to use to prove AC. What this is equiv­a­lent to say­ing is that there are mod­els of set the­ory in which these the­o­rems are true, but for which the ax­iom of choice does not hold in full gen­er­al­ity.

A few ex­am­ples of such the­o­rems:

  • The Haus­dorff para­dox and Banach-Tarski para­dox (men­tioned above).

  • A union of countably many countable sets is countable.

  • The ax­iom of de­pen­dent choice (given a non-empty set \(X\) and (en­tire) bi­nary re­la­tion \(R\), there ex­ists a se­quence \((x_n)_{n \in \mathbb{N}}\) such that \(x_n\) is \(R\)-re­lated to \(x_{n+1}\)) .

  • The ax­iom of countable choice (ev­ery countable set of sets has a choice func­tion).

  • Every field has an alge­braic closure

  • Ex­is­tence of non-prin­ci­pal ul­tra­filters.

  • Gödel’s com­plete­ness the­o­rem for first-or­der logic.

  • Boolean Prime Ideal The­o­rem (use­ful for prov­ing ex­is­tence of non-prin­ci­pal ul­tra­filters and Gödel’s com­pletenes the­o­rem).

  • The Law of ex­cluded mid­dle for logic.

More ex­am­ples may be found on Wikipe­dia page.

And In Re­lated News: The Con­tinuum Hypothesis

In­tu­itively, the Con­tinuum Hy­poth­e­sis (CH) states that there is no set strictly big­ger than the set of all nat­u­ral num­bers, but strictly smaller than the set of all real num­bers. (Th­ese are two in­finite sets, but they are differ­ent in­fini­ties).

The for­mal state­ment con­cerns car­di­nal­ity of sets. In par­tic­u­lar, it states that there is no set which has car­di­nal­ity strictly larger than the set of nat­u­ral num­bers, but strictly smaller than the set of real num­bers.

It is called the ‘Con­tinuum Hy­poth­e­sis’ be­cause it con­cerns the size of the con­tinuum (the set of real num­bers) and was hy­poth­e­sized to be true by Ge­org Can­tor in 1878.

It is again in­de­pen­dent of the Zer­melo-Fraenkel ax­ioms, and this was proven in the same man­ner and at the same time as the proof of the in­de­pen­dence of AC from ZF (de­scribed in more de­tail above).

In fact, the con­tinuum hy­poth­e­sis was shown to be in­de­pen­dent even from ZFC, (ZF with the Ax­iom of Choice). How­ever, the con­tinuum hy­poth­e­sis im­plies the Ax­iom of Choice un­der ZF. In other words, given the ZF ax­ioms, if you know that the Ax­iom of Choice is true then you do not yet know any­thing about the truth of Con­tinuum Hy­poth­e­sis. How­ever, if you know that the Con­tinuum Hy­poth­e­sis is true, then you know the Ax­iom of Choice must also be true.

The Gen­er­al­ized Con­tinuum Hyptho­sis (GCH) is, well, a gen­er­al­ized ver­sion of the Con­tinuum Hy­poth­e­sis, stat­ing that not only are there no sets of size ly­ing strictly be­tween the nat­u­ral num­bers \(\mathbb{N}\) and the re­als \(\mathbb{R}\), but that for any set \(X\), there is no set of size ly­ing strictly be­tween the sizes of \(X\) and the power set \(P(X)\). In par­tic­u­lar, note that the re­als \(\mathbb{R}\) is of the same size as the power set \(P(\mathbb{N})\) of the nat­u­rals, so that GCH im­plies CH. It is also strictly stronger than CH (it is not im­plied by CH).

Ax­iom of Choice Con­sid­ered Harm­ful: Con­struc­tive Math­e­mat­ics and the Po­ten­tial Pit­falls of AC

Why does the ax­iom of choice have such a bad rep­u­ta­tion with con­struc­tive math­e­mat­i­ci­ans?

It is im­por­tant to re­al­ise that some of the rea­sons math­e­mat­i­ci­ans had for doubt­ing AC are no longer rele­vant. Be­cause of some of its counter-in­tu­itive re­sults, math­e­mat­i­ci­ans

To un­der­stand this view, it is nec­es­sary to un­der­stand more about the con­struc­tive view in gen­eral. %TODO

(Post­ing a list of links here un­til I have a bet­ter idea of what to write, in ap­prox. re­verse or­der of rele­vance /​ use­ful­ness) See also

Choos­ing Not to Choose: Set-The­o­retic Ax­ioms Which Con­tra­dict Choice

It is also pos­si­ble to as­sume ax­ioms which con­tra­dict the ax­iom of choice. For ex­am­ple, there is the Ax­iom of Deter­mi­nancy. This ax­iom states that for any two-per­son game of a cer­tain type, one player has a win­ning strat­egy. %TODO

I Want to Play a Game: Coun­ter­in­tu­itive Strate­gies Us­ing AC

There are a few ex­am­ples of very counter-in­tu­itive solu­tions to (seem­ingly) im­pos­si­ble chal­langes which work only in the pres­ence of the Ax­iom of Choice.

Ex­am­ples may be found in the fol­low­ing places: