Coherent decisions imply consistent utilities

In­tro­duc­tion to the in­tro­duc­tion: Why ex­pected util­ity?

So we’re talk­ing about how to make good de­ci­sions, or the idea of ‘bounded ra­tio­nal­ity’, or what suffi­ciently ad­vanced Ar­tifi­cial In­tel­li­gences might be like; and some­body starts drag­ging up the con­cepts of ‘ex­pected util­ity’ or ‘util­ity func­tions’.

And be­fore we even ask what those are, we might first ask, Why?

There’s a math­e­mat­i­cal for­mal­ism, ‘ex­pected util­ity’, that some peo­ple in­vented to talk about mak­ing de­ci­sions. This for­mal­ism is very aca­dem­i­cally pop­u­lar, and ap­pears in all the text­books.

But so what? Why is that nec­es­sar­ily the best way of mak­ing de­ci­sions un­der ev­ery kind of cir­cum­stance? Why would an Ar­tifi­cial In­tel­li­gence care what’s aca­dem­i­cally pop­u­lar? Maybe there’s some bet­ter way of think­ing about ra­tio­nal agency? Heck, why is this for­mal­ism pop­u­lar in the first place?

We can ask the same kinds of ques­tions about prob­a­bil­ity the­ory:

Okay, we have this math­e­mat­i­cal for­mal­ism in which the chance that X hap­pens, aka \(\mathbb P(X),\) plus the chance that X doesn’t hap­pen, aka \(\mathbb P(\neg X),\) must be rep­re­sented in a way that makes the two quan­tities sum to unity: \(\mathbb P(X) + \mathbb P(\neg X) = 1.\)

That for­mal­ism for prob­a­bil­ity has some neat math­e­mat­i­cal prop­er­ties. But so what? Why should the best way of rea­son­ing about a messy, un­cer­tain world have neat prop­er­ties? Why shouldn’t an agent rea­son about ‘how likely is that’ us­ing some­thing com­pletely un­like prob­a­bil­ities? How do you know a suffi­ciently ad­vanced Ar­tifi­cial In­tel­li­gence would rea­son in prob­a­bil­ities? You haven’t seen an AI, so what do you think you know and how do you think you know it?

That en­tirely rea­son­able ques­tion is what this in­tro­duc­tion tries to an­swer. There are, in­deed, ex­cel­lent rea­sons be­yond aca­demic habit and math­e­mat­i­cal con­ve­nience for why we would by de­fault in­voke ‘ex­pected util­ity’ and ‘prob­a­bil­ity the­ory’ to think about good hu­man de­ci­sions, talk about ra­tio­nal agency, or rea­son about suffi­ciently ad­vanced AIs.

The broad form of the an­swer seems eas­ier to show than to tell, so we’ll just plunge straight in.

Why not cir­cu­lar prefer­ences?

De gustibus non est dis­putan­dum, goes the proverb; mat­ters of taste can­not be dis­puted. If I like onions on my pizza and you like pineap­ple, it’s not that one of us is right and one of us is wrong. We just pre­fer differ­ent pizza top­pings.

Well, but sup­pose I de­clare to you that I si­mul­ta­neously:

  • Pre­fer onions to pineap­ple on my pizza.

  • Pre­fer pineap­ple to mush­rooms on my pizza.

  • Pre­fer mush­rooms to onions on my pizza.

If we use \(>_P\) to de­note my pizza prefer­ences, with \(X >_P Y\) de­not­ing that I pre­fer X to Y, then I am declar­ing:

$$\text{onions} >_P \text{pineapple} >_P \text{mushrooms} >_P \text{onions}$$

That sounds strange, to be sure. But is there any­thing wrong with that? Can we dis­putan­dum it?

We used the math sym­bol \(>\) which de­notes an or­der­ing. If we ask whether \(>_P\) can be an or­der­ing, it naugh­tily vi­o­lates the stan­dard tran­si­tivity ax­iom \(x > y, y > z \implies x > z\).

Okay, so then maybe we shouldn’t have used the sym­bol \(>_P\) or called it an or­der­ing. Why is that nec­es­sar­ily bad?

We can try to imag­ine each pizza as hav­ing a nu­mer­i­cal score de­not­ing how much I like it. In that case, there’s no way we could as­sign con­sis­tent num­bers \(x, y, z\) to those three pizza top­pings such that \(x > y > z > x.\)

So maybe I don’t as­sign num­bers to my pizza. Why is that so awful?

Are there any grounds besides “we like a certain mathematical formalism and your choices don’t fit into our math,” on which to criticize my three simultaneous preferences?

(Feel free to try to an­swer this your­self be­fore con­tin­u­ing…)

Suppose I tell you that I prefer pineapple to mushrooms on my pizza. Suppose you’re about to give me a slice of mushroom pizza; but by paying one penny ($0.01) I can instead get a slice of pineapple pizza (which is just as fresh from the oven). It seems realistic to say that most people with a pineapple pizza preference would probably pay the penny, if they happened to have a penny in their pocket. (Note: It could be that somebody’s pizza preference is real, but so weak that they wouldn’t pay one penny to get the pizza they prefer. In this case, imagine we’re talking about some stronger preference instead. Like your willingness to pay at least one penny not to have your house burned down, or something.)

After I pay the penny, though, and just be­fore I’m about to get the pineap­ple pizza, you offer me a slice of onion pizza in­stead—no charge for the change! If I was tel­ling the truth about prefer­ring onion pizza to pineap­ple, I should cer­tainly ac­cept the sub­sti­tu­tion if it’s free.

And then to round out the day, you offer me a mush­room pizza in­stead of the onion pizza, and again, since I pre­fer mush­rooms to onions, I ac­cept the swap.

I end up with exactly the same slice of mushroom pizza I started with… and one penny poorer, because I previously paid $0.01 to swap mushrooms for pineapple.

This seems like a qual­i­ta­tively bad be­hav­ior on my part. By virtue of my in­co­her­ent prefer­ences which can­not be given a con­sis­tent or­der­ing, I have shot my­self in the foot, done some­thing self-defeat­ing. We haven’t said how I ought to sort out my in­con­sis­tent prefer­ences. But no mat­ter how it shakes out, it seems like there must be some bet­ter al­ter­na­tive—some bet­ter way I could rea­son that wouldn’t spend a penny to go in cir­cles. That is, I could at least have kept my origi­nal pizza slice and not spent the penny.

In a phrase you’re going to keep hearing, I have executed a ‘dominated strategy’: there exists some other strategy that does strictly better. (Note: This does assume that the agent prefers to have more money rather than less money. “Ah, but why is it bad if one person has a penny instead of another?” you ask. If we insist on pinning down every point of this sort, then you can also imagine the $0.01 as standing in for the time I burned in order to move the pizza slices around in circles. That time was burned, and nobody else has it now. If I’m an effective agent that goes around pursuing my preferences, I should in general be able to sometimes convert time into other things that I want. In other words, my circular preference can lead me to incur an opportunity cost denominated in the sacrifice of other things I want, and not in a way that benefits anyone else.)
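The pizza swaps above can be replayed mechanically. Here is a minimal Python sketch; the `PREFERS` set, the `run_money_pump` function, and the starting balance of 100 cents are all made up for illustration, and the agent is assumed to accept any swap to a topping it prefers whenever it can afford the fee.

```python
# Hypothetical encoding of the circular preferences from the text:
# each pair (a, b) means "a is preferred to b".
PREFERS = {
    ("onions", "pineapple"),
    ("pineapple", "mushrooms"),
    ("mushrooms", "onions"),
}

def run_money_pump(start, offers, cents):
    """Accept every offered swap to a preferred topping, paying its fee in cents."""
    holding = start
    for topping, fee in offers:
        if (topping, holding) in PREFERS and cents >= fee:
            holding, cents = topping, cents - fee
    return holding, cents

# The three offers from the story: pay a penny for pineapple, then two free swaps.
offers = [("pineapple", 1), ("onions", 0), ("mushrooms", 0)]
final = run_money_pump("mushrooms", offers, cents=100)
print(final)  # ('mushrooms', 99)
```

The agent ends up holding the slice it started with and is one cent poorer, while the "make no trades" strategy would have kept both the slice and the cent.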

Or as Steve Omo­hun­dro put it: If you pre­fer be­ing in Berkeley to be­ing in San Fran­cisco; pre­fer be­ing in San Jose to be­ing in Berkeley; and pre­fer be­ing in San Fran­cisco to be­ing in San Jose; then you’re go­ing to waste a lot of time on taxi rides.

None of this reasoning has told us that a non-self-defeating agent must prefer Berkeley to San Francisco or vice versa. There are at least six possible consistent orderings over pizza toppings, like \(\text{mushroom} >_P \text{pineapple} >_P \text{onion}\) et cetera, and any consistent ordering would avoid paying to go in circles. (Note: There are more than six possibilities if you think it’s possible to be absolutely indifferent between two kinds of pizza.) We have not, in this argument, used pure logic to derive that pineapple pizza must taste better than mushroom pizza to an ideal rational agent. But we’ve seen that eliminating a certain kind of shoot-yourself-in-the-foot behavior corresponds to imposing a certain coherence or consistency requirement on whatever preferences are there.
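To see concretely that no consistent ordering matches the circular preferences, we can brute-force all strict rankings of the three toppings. This is only a sketch; the names `TOPPINGS`, `CYCLE` and `agrees` are invented here, and indifference between toppings is ignored.

```python
from itertools import permutations

TOPPINGS = ("onions", "pineapple", "mushrooms")
# The three stated preferences, as (better, worse) pairs.
CYCLE = [("onions", "pineapple"), ("pineapple", "mushrooms"), ("mushrooms", "onions")]

def agrees(ranking, prefs):
    """True if every (better, worse) pair puts `better` earlier in the ranking."""
    return all(ranking.index(b) < ranking.index(w) for b, w in prefs)

rankings = list(permutations(TOPPINGS))  # all strict orderings, best first
compatible = [r for r in rankings if agrees(r, CYCLE)]
print(len(rankings), len(compatible))  # 6 0
```

Dropping any one of the three stated preferences leaves exactly one compatible ranking, which is one way of seeing that the trouble lies in the cycle rather than in any individual taste.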

It turns out that this is just one in­stance of a large fam­ily of co­her­ence the­o­rems which all end up point­ing at the same set of core prop­er­ties. All roads lead to Rome, and all the roads say, “If you are not shoot­ing your­self in the foot in sense X, we can view you as hav­ing co­her­ence prop­erty Y.”

There are some caveats to this gen­eral idea.

For ex­am­ple: In com­pli­cated prob­lems, perfect co­her­ence is usu­ally im­pos­si­ble to com­pute—it’s just too ex­pen­sive to con­sider all the pos­si­bil­ities.

But there are also caveats to the caveats! For ex­am­ple, it may be that if there’s a pow­er­ful ma­chine in­tel­li­gence that is not visi­bly to us hu­mans shoot­ing it­self in the foot in way X, then from our per­spec­tive it must look like the AI has co­her­ence prop­erty Y. If there’s some sense in which the ma­chine in­tel­li­gence is go­ing in cir­cles, be­cause not go­ing in cir­cles is too hard to com­pute, well, we won’t see that ei­ther with our tiny hu­man brains. In which case it may make sense, from our per­spec­tive, to think about the ma­chine in­tel­li­gence as if it has some co­her­ent prefer­ence or­der­ing.

We are not go­ing to go through all the co­her­ence the­o­rems in this in­tro­duc­tion. They form a very large fam­ily; some of them are a lot more math­e­mat­i­cally in­timi­dat­ing; and hon­estly I don’t know even 5% of the var­i­ants.

But we can hope­fully walk through enough co­her­ence the­o­rems to at least start to see the rea­son­ing be­hind, “Why ex­pected util­ity?” And, be­cause the two are a pack­age deal, “Why prob­a­bil­ity?”

Hu­man lives, mere dol­lars, and co­her­ent trades

An experiment in 2000, from a paper titled “The Psychology of the Unthinkable: Taboo Trade-Offs, Forbidden Base Rates, and Heretical Counterfactuals”, asked subjects to consider the dilemma of a hospital administrator named Robert:

Robert can save the life of Johnny, a five year old who needs a liver trans­plant, but the trans­plant pro­ce­dure will cost the hos­pi­tal $1,000,000 that could be spent in other ways, such as pur­chas­ing bet­ter equip­ment and en­hanc­ing salaries to re­cruit tal­ented doc­tors to the hos­pi­tal. Johnny is very ill and has been on the wait­ing list for a trans­plant but be­cause of the short­age of lo­cal or­gan donors, ob­tain­ing a liver will be ex­pen­sive. Robert could save Johnny’s life, or he could use the $1,000,000 for other hos­pi­tal needs.

The main ex­per­i­men­tal re­sult was that most sub­jects got an­gry at Robert for even con­sid­er­ing the ques­tion.

After all, you can’t put a dol­lar value on a hu­man life, right?

But better hospital equipment also saves lives, or at least one hopes so. (Note: We can omit the ‘better doctors’ item from consideration: The supply of doctors is mostly constrained by regulatory burdens and medical schools rather than the number of people who want to become doctors; so bidding up salaries for doctors doesn’t much increase the total number of doctors; so bidding on a talented doctor at one hospital just means some other hospital doesn’t get that talented doctor. It’s also illegal to pay for livers, but let’s ignore that particular issue with the problem setup or pretend that it all takes place in a more sensible country than the United States or Europe.) It’s not like the other potential use of the money saves zero lives.

Let’s say that Robert has a to­tal bud­get of $100,000,000 and is faced with a long list of op­tions such as these:

  • $100,000 for a new dial­y­sis ma­chine, which will save 3 lives

  • $1,000,000 for a liver for Johnny, which will save 1 life

  • $10,000 to train the nurses on proper hy­giene when in­sert­ing cen­tral lines, which will save an ex­pected 100 lives

Now sup­pose—this is a sup­po­si­tion we’ll need for our the­o­rem—that Robert does not care at all about money, not even a tiny bit. Robert only cares about max­i­miz­ing the to­tal num­ber of lives saved. Fur­ther­more, we sup­pose for now that Robert cares about ev­ery hu­man life equally.

If Robert does save as many lives as pos­si­ble, given his bounded money, then Robert must be­have like some­body as­sign­ing some con­sis­tent dol­lar value to sav­ing a hu­man life.

We should be able to look down the long list of op­tions that Robert took and didn’t take, and say, e.g., “Oh, Robert took all the op­tions that saved more than 1 life per $500,000 and re­jected all op­tions that saved less than 1 life per $500,000; so Robert’s be­hav­ior is con­sis­tent with his spend­ing $500,000 per life.”

Alter­na­tively, if we can’t view Robert’s be­hav­ior as be­ing co­her­ent in this sense—if we can­not make up any dol­lar value of a hu­man life, such that Robert’s choices are con­sis­tent with that dol­lar value—then it must be pos­si­ble to move around the same amount of money, in a way that saves more lives.

We start from the qual­i­ta­tive crite­rion, “Robert must save as many lives as pos­si­ble; it shouldn’t be pos­si­ble to move around the same money to save more lives”. We end up with the quan­ti­ta­tive co­her­ence the­o­rem, “It must be pos­si­ble to view Robert as trad­ing dol­lars for lives at a con­sis­tent price.”
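One way to make the ‘consistent price’ check concrete, treating every option as small relative to the budget: every accepted option must cost no more per life than every rejected one. The sketch below assumes each option is a (cost in dollars, lives saved) pair; the function names and the two accept/reject patterns are hypothetical, with the figures borrowed from the list of options above.

```python
def cost_per_life(option):
    """option is a (cost_in_dollars, lives_saved) pair."""
    cost, lives = option
    return cost / lives

def consistent_price_range(accepted, rejected):
    """Return (low, high) bounds on a dollars-per-life price that explains the
    choices, or None if no single price does."""
    low = max((cost_per_life(o) for o in accepted), default=0.0)
    high = min((cost_per_life(o) for o in rejected), default=float("inf"))
    return (low, high) if low <= high else None

dialysis, liver, hygiene = (100_000, 3), (1_000_000, 1), (10_000, 100)

# Hypothetical coherent pattern: fund the two cheap-per-life options, skip the liver.
coherent = consistent_price_range([hygiene, dialysis], [liver])
# Hypothetical incoherent pattern: fund the liver but skip the dialysis machine.
incoherent = consistent_price_range([liver, hygiene], [dialysis])
print(coherent is not None, incoherent)  # True None
```

In the incoherent pattern, dropping the liver and buying the dialysis machine instead would save 3 lives rather than 1 and leave $900,000 unspent, which is the kind of rearrangement the argument says must exist.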

We haven’t proven that dol­lars have some in­trin­sic worth that trades off against the in­trin­sic worth of a hu­man life. By hy­poth­e­sis, Robert doesn’t care about money at all. It’s just that ev­ery dol­lar has an op­por­tu­nity cost in lives it could have saved if de­ployed differ­ently; and this op­por­tu­nity cost is the same for ev­ery dol­lar be­cause money is fun­gible.

An im­por­tant caveat to this the­o­rem is that there may be, e.g., an op­tion that saves a hun­dred thou­sand lives for $200,000,000. But Robert only has $100,000,000 to spend. In this case, Robert may fail to take that op­tion even though it saves 1 life per $2,000. It was a good op­tion, but Robert didn’t have enough money in the bank to af­ford it. This does mess up the el­e­gance of be­ing able to say, “Robert must have taken all the op­tions sav­ing at least 1 life per $500,000”, and in­stead we can only say this with re­spect to op­tions that are in some sense small enough or gran­u­lar enough.

Similarly, if an option costs $5,000,000 to save 15 lives, but Robert only has $4,000,000 left over after taking all his other best opportunities, Robert’s last selected option might be to save 8 lives for $4,000,000 instead. This again messes up the elegance of the reasoning, but Robert is still doing exactly what an agent would do if it consistently valued lives at 1 life per $500,000: it would buy all the best options it could afford that purchased at least that many lives per dollar. So that part of the theorem’s conclusion still holds.

Another caveat is that we haven’t proven that there’s some spe­cific dol­lar value in Robert’s head, as a mat­ter of psy­chol­ogy. We’ve only proven that Robert’s out­ward be­hav­ior can be viewed as if it prices lives at some con­sis­tent value, as­sum­ing Robert saves as many lives as pos­si­ble.

It could be that Robert ac­cepts ev­ery op­tion that spends less than $500,000/​life and re­jects ev­ery op­tion that spends over $600,000, and there aren’t any available op­tions in the mid­dle. Then Robert’s be­hav­ior can equally be viewed as con­sis­tent with a price of $510,000 or a price of $590,000. This helps show that we haven’t proven any­thing about Robert ex­plic­itly think­ing of some num­ber. Maybe Robert never lets him­self think of a spe­cific thresh­old value, be­cause it would be taboo to as­sign a dol­lar value to hu­man life; and in­stead Robert just fid­dles the choices un­til he can’t see how to save any more lives.

We nat­u­rally have not proved by pure logic that Robert must want, in the first place, to save as many lives as pos­si­ble. Even if Robert is a good per­son, this doesn’t fol­low. Maybe Robert val­ues a 10-year-old’s life at 5 times the value of a 70-year-old’s life, so that Robert will sac­ri­fice five grand­par­ents to save one 10-year-old. A lot of peo­ple would see that as en­tirely con­sis­tent with valu­ing hu­man life in gen­eral.

Let’s con­sider that last idea more thor­oughly. If Robert con­sid­ers a pre­teen equally valuable with 5 grand­par­ents, so that Robert will shift $100,000 from sav­ing 8 old peo­ple to sav­ing 2 chil­dren, then we can no longer say that Robert wants to save as many ‘lives’ as pos­si­ble. That last de­ci­sion would de­crease by 6 the to­tal num­ber of ‘lives’ saved. So we can no longer say that there’s a qual­i­ta­tive crite­rion, ‘Save as many lives as pos­si­ble’, that pro­duces the quan­ti­ta­tive co­her­ence re­quire­ment, ‘trade dol­lars for lives at a con­sis­tent rate’.

Does this mean that co­her­ence might as well go out the win­dow, so far as Robert’s be­hav­ior is con­cerned? Any­thing goes, now? Just spend money wher­ever?

“Hm,” you might think. “But… if Robert trades 8 old peo­ple for 2 chil­dren here… and then trades 1 child for 2 old peo­ple there…”

To re­duce dis­trac­tion, let’s make this prob­lem be about ap­ples and or­anges in­stead. Sup­pose:

  • Alice starts with 8 ap­ples and 1 or­ange.

  • Then Alice trades 8 ap­ples for 2 or­anges.

  • Then Alice trades away 1 or­ange for 2 ap­ples.

  • Fi­nally, Alice trades an­other or­ange for 3 ap­ples.

Then in this ex­am­ple, Alice is us­ing a strat­egy that’s strictly dom­i­nated across all cat­e­gories of fruit. Alice ends up with 5 ap­ples and one or­ange, but could’ve ended with 8 ap­ples and one or­ange (by not mak­ing any trades at all). Re­gard­less of the rel­a­tive value of ap­ples and or­anges, Alice’s strat­egy is do­ing qual­i­ta­tively worse than an­other pos­si­ble strat­egy, if ap­ples have any pos­i­tive value to her at all.
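Alice’s trades are easy to tally by hand, but a short sketch makes the bookkeeping explicit. The `apply_trades` helper and the dictionary layout are assumptions made for this illustration only.

```python
from collections import Counter

def apply_trades(start, trades):
    """Apply (give, get) trades in order; each side maps fruit name -> count."""
    holdings = Counter(start)
    for give, get in trades:
        holdings.subtract(give)
        holdings.update(get)
    return dict(holdings)

start = {"apples": 8, "oranges": 1}
trades = [
    ({"apples": 8}, {"oranges": 2}),
    ({"oranges": 1}, {"apples": 2}),
    ({"oranges": 1}, {"apples": 3}),
]
end = apply_trades(start, trades)
print(end)  # {'apples': 5, 'oranges': 1}

# Dominated: no category gained anything, and at least one lost something.
dominated = all(end[f] <= start[f] for f in start) and end != start
print(dominated)  # True
```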

So the fact that Alice can’t be viewed as hav­ing any co­her­ent rel­a­tive value for ap­ples and or­anges, cor­re­sponds to her end­ing up with qual­i­ta­tively less of some cat­e­gory of fruit (with­out any cor­re­spond­ing gains el­se­where).

This re­mains true if we in­tro­duce more kinds of fruit into the prob­lem. Let’s say the set of fruits Alice can trade in­cludes {ap­ples, or­anges, straw­ber­ries, plums}. If we can’t look at Alice’s trades and make up some rel­a­tive quan­ti­ta­tive val­ues of fruit, such that Alice could be trad­ing con­sis­tently with re­spect to those val­ues, then Alice’s trad­ing strat­egy must have been dom­i­nated by some other strat­egy that would have ended up with strictly more fruit across all cat­e­gories.

In other words, we need to be able to look at Alice’s trades, and say some­thing like:

“Maybe Alice val­ues an or­ange at 2 ap­ples, a straw­berry at 0.1 ap­ples, and a plum at 0.5 ap­ples. That would ex­plain why Alice was will­ing to trade 4 straw­ber­ries for a plum, but not will­ing to trade 40 straw­ber­ries for an or­ange and an ap­ple.”

And if we can’t say this, then there must be some way to re­ar­range Alice’s trades and get strictly more fruit across all cat­e­gories in the sense that, e.g., we end with the same num­ber of plums and ap­ples, but one more or­ange and two more straw­ber­ries. This is a bad thing if Alice qual­i­ta­tively val­ues fruit from each cat­e­gory—prefers hav­ing more fruit to less fruit, ce­teris paribus, for each cat­e­gory of fruit.
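The hypothetical valuation quoted above can be checked against the two trades it is meant to explain. This is a sketch under the assumption that Alice accepts a trade exactly when it has positive net value in apples; `VALUES`, `worth` and `net_gain` are names invented for the example, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Hypothetical valuation, in apples, taken from the quoted example.
VALUES = {"apple": Fraction(1), "orange": Fraction(2),
          "strawberry": Fraction(1, 10), "plum": Fraction(1, 2)}

def worth(bundle):
    """Total value of a {fruit: count} bundle, measured in apples."""
    return sum(VALUES[fruit] * count for fruit, count in bundle.items())

def net_gain(give, get):
    """Value received minus value given up; positive means the trade is worth taking."""
    return worth(get) - worth(give)

accepted = net_gain({"strawberry": 4}, {"plum": 1})
refused = net_gain({"strawberry": 40}, {"orange": 1, "apple": 1})
print(accepted, refused)  # 1/10 -1
```

The first trade gains a tenth of an apple in value and the second loses a whole apple, matching the accept and refuse decisions in the quote.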

Now let’s shift our at­ten­tion back to Robert the hos­pi­tal ad­minis­tra­tor. Either we can view Robert as con­sis­tently as­sign­ing some rel­a­tive value of life for 10-year-olds vs. 70-year-olds, or there must be a way to re­ar­range Robert’s ex­pen­di­tures to save ei­ther strictly more 10-year-olds or strictly more 70-year-olds. The same logic ap­plies if we add 50-year-olds to the mix. We must be able to say some­thing like, “Robert is con­sis­tently be­hav­ing as if a 50-year-old is worth a third of a ten-year-old”. If we can’t say that, Robert must be be­hav­ing in a way that pointlessly dis­cards some save­able lives in some cat­e­gory.

Or perhaps Robert is behaving in a way which implies that 10-year-old girls are worth more than 10-year-old boys. But then the relative values of those subclasses of 10-year-olds need to be viewable as consistent; or else Robert must be qualitatively failing to save one more 10-year-old boy than could’ve been saved otherwise.

If you can de­nom­i­nate ap­ples in or­anges, and price or­anges in plums, and trade off plums for straw­ber­ries, all at con­sis­tent rates… then you might as well take it one step fur­ther, and fac­tor out an ab­stract unit for ease of no­ta­tion.

Let’s call this unit 1 utilon, and de­note it €1. (As we’ll see later, the let­ters ‘EU’ are ap­pro­pri­ate here.)

If we say that ap­ples are worth €1, or­anges are worth €2, and plums are worth €0.5, then this tells us the rel­a­tive value of ap­ples, or­anges, and plums. Con­versely, if we can as­sign con­sis­tent rel­a­tive val­ues to ap­ples, or­anges, and plums, then we can fac­tor out an ab­stract unit at will—for ex­am­ple, by ar­bi­trar­ily declar­ing ap­ples to be worth €100 and then calcu­lat­ing ev­ery­thing else’s price in ap­ples.

Have we proven by pure logic that all apples have the same utility? Of course not; you can prefer some particular apples to other particular apples. But when you’re done saying which things you qualitatively prefer to which other things, if you go around making tradeoffs in a way that can be viewed as not qualitatively leaving behind some things you said you wanted, we can view you as assigning coherent quantitative utilities to everything you want.

And that’s one co­her­ence the­o­rem—among oth­ers—that can be seen as mo­ti­vat­ing the con­cept of util­ity in de­ci­sion the­ory.

Utility isn’t a solid thing, a sep­a­rate thing. We could mul­ti­ply all the util­ities by two, and that would cor­re­spond to the same out­ward be­hav­iors. It’s mean­ingless to ask how much util­ity you scored at the end of your life, be­cause we could sub­tract a mil­lion or add a mil­lion to that quan­tity while leav­ing ev­ery­thing else con­cep­tu­ally the same.

You could pick any­thing you val­ued—say, the joy of watch­ing a cat chase a laser poin­ter for 10 sec­onds—and de­nom­i­nate ev­ery­thing rel­a­tive to that, with­out need­ing any con­cept of an ex­tra ab­stract ‘util­ity’. So (just to be ex­tremely clear about this point) we have not proven that there is a sep­a­rate thing ‘util­ity’ that you should be pur­su­ing in­stead of ev­ery­thing else you wanted in life.

The co­her­ence the­o­rem says noth­ing about which things to value more than oth­ers, or how much to value them rel­a­tive to other things. It doesn’t say whether you should value your hap­piness more than some­one else’s hap­piness, any more than the no­tion of a con­sis­tent prefer­ence or­der­ing \(>_P\) tells us whether \(\text{onions} >_P \text{pineapple}.\)

(The no­tion that we should as­sign equal value to all hu­man lives, or equal value to all sen­tient lives, or equal value to all Qual­ity-Ad­justed Life Years, is util­i­tar­i­anism. Which is, sorry about the con­fu­sion, a whole ’nother sep­a­rate differ­ent philos­o­phy.)

The con­cep­tual gizmo that maps thin­gies to util­ities—the whatchamacal­lit that takes in a fruit and spits out a util­ity—is called a ‘util­ity func­tion’. Again, this isn’t a sep­a­rate thing that’s writ­ten on a stone tablet. If we mul­ti­ply a util­ity func­tion by 9.2, that’s con­cep­tu­ally the same util­ity func­tion be­cause it’s con­sis­tent with the same set of be­hav­iors.
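A small sketch of why rescaling changes nothing observable: ranking options by a utility function gives the same answer after multiplying by 9.2, or after adding a million. The `ranking` helper and the `base` table are assumptions for illustration, reusing the €1 / €2 / €0.5 figures from above.

```python
def ranking(utility, options):
    """Options sorted from most to least preferred under `utility`."""
    return sorted(options, key=utility, reverse=True)

base = {"apple": 1.0, "orange": 2.0, "plum": 0.5}

def original(fruit):
    return base[fruit]

def scaled(fruit):
    return 9.2 * base[fruit]

def shifted(fruit):
    return base[fruit] + 1_000_000

print(ranking(original, base))  # ['orange', 'apple', 'plum']
print(ranking(scaled, base) == ranking(original, base))   # True
print(ranking(shifted, base) == ranking(original, base))  # True
```

Only the choices are observable from outside, and those come out identical under all three functions.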

But in gen­eral: If we can sen­si­bly view any agent as do­ing as well as qual­i­ta­tively pos­si­ble at any­thing, we must be able to view the agent’s be­hav­ior as con­sis­tent with there be­ing some co­her­ent rel­a­tive quan­tities of want­ed­ness for all the thin­gies it’s try­ing to op­ti­mize.

Prob­a­bil­ities and ex­pected utility

We’ve so far made no men­tion of prob­a­bil­ity. But the way that prob­a­bil­ities and util­ities in­ter­act, is where we start to see the full struc­ture of ex­pected util­ity spotlighted by all the co­her­ence the­o­rems.

The ba­sic no­tion in ex­pected util­ity is that some choices pre­sent us with un­cer­tain out­comes.

For ex­am­ple, I come to you and say: “Give me 1 ap­ple, and I’ll flip a coin; if the coin lands heads, I’ll give you 1 or­ange; if the coin comes up tails, I’ll give you 3 plums.” Sup­pose you rel­a­tively value fruits as de­scribed ear­lier: 2 ap­ples /​ or­ange and 0.5 ap­ples /​ plum. Then ei­ther pos­si­ble out­come gives you some­thing that’s worth more to you than 1 ap­ple. Turn­ing down a so-called ‘gam­ble’ like that… why, it’d be a dom­i­nated strat­egy.

In gen­eral, the no­tion of ‘ex­pected util­ity’ says that we as­sign cer­tain quan­tities called prob­a­bil­ities to each pos­si­ble out­come. In the ex­am­ple above, we might as­sign a ‘prob­a­bil­ity’ of \(0.5\) to the coin land­ing heads (1 or­ange), and a ‘prob­a­bil­ity’ of \(0.5\) to the coin land­ing tails (3 plums). Then the to­tal value of the ‘gam­ble’ we get by trad­ing away 1 ap­ple is:

$$\mathbb P(\text{heads}) \cdot U(\text{1 orange}) + \mathbb P(\text{tails}) \cdot U(\text{3 plums}) \\ = 0.50 \cdot €2 + 0.50 \cdot €1.5 = €1.75$$

Conversely, if we just keep our 1 apple instead of making the trade, this has an expected utility of \(1 \cdot U(\text{1 apple}) = €1.\) So indeed we ought to trade (as the previous reasoning suggested).
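The same arithmetic as a sketch in Python, with a lottery represented as a list of (probability, bundle) pairs. That representation, and the names `UTILITY`, `bundle_utility` and `expected_utility`, are choices made for this example rather than anything fixed by the theory.

```python
from fractions import Fraction

# Utilities in utilons, as assumed earlier: apple 1, orange 2, plum 0.5.
UTILITY = {"apple": Fraction(1), "orange": Fraction(2), "plum": Fraction(1, 2)}

def bundle_utility(bundle):
    """Utility of a {fruit: count} bundle."""
    return sum(UTILITY[fruit] * count for fruit, count in bundle.items())

def expected_utility(lottery):
    """lottery is a list of (probability, bundle) pairs."""
    return sum(p * bundle_utility(bundle) for p, bundle in lottery)

gamble = [(Fraction(1, 2), {"orange": 1}), (Fraction(1, 2), {"plum": 3})]
keep = [(Fraction(1), {"apple": 1})]
print(float(expected_utility(gamble)), float(expected_utility(keep)))  # 1.75 1.0
```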

“But wait!” you cry. “Where did these prob­a­bil­ities come from? Why is the ‘prob­a­bil­ity’ of a fair coin land­ing heads \(0.5\) and not, say, \(-0.2\) or \(3\)? Who says we ought to mul­ti­ply util­ities by prob­a­bil­ities in the first place?”

If you’re used to approaching this problem from a Bayesian standpoint, then you may now be thinking of notions like prior probability and Occam’s Razor and universal priors.

But from the stand­point of co­her­ence the­o­rems, that’s putting the cart be­fore the horse.

From the stand­point of co­her­ence the­o­rems, we don’t start with a no­tion of ‘prob­a­bil­ity’.

In­stead we ought to prove some­thing along the lines of: if you’re not us­ing qual­i­ta­tively dom­i­nated strate­gies, then you must be­have as if you are mul­ti­ply­ing util­ities by cer­tain quan­ti­ta­tive thin­gies.

We might then fur­ther­more show that, for non-dom­i­nated strate­gies, these util­ity-mul­ti­ply­ing thin­gies must be be­tween \(0\) and \(1\) rather than say \(-0.3\) or \(27.\)

Hav­ing de­ter­mined what co­her­ence prop­er­ties these util­ity-mul­ti­ply­ing thin­gies need to have, we de­cide to call them ‘prob­a­bil­ities’. And then—once we know in the first place that we need ‘prob­a­bil­ities’ in or­der to not be us­ing dom­i­nated strate­gies—we can start to worry about ex­actly what the num­bers ought to be.

Prob­a­bil­ities sum­ming to 1

Here’s a taste of the kind of rea­son­ing we might do:

Sup­pose that—hav­ing already ac­cepted some pre­vi­ous proof that non-dom­i­nated strate­gies deal­ing with un­cer­tain out­comes, must mul­ti­ply util­ities by quan­ti­ta­tive thin­gies—you then say that you are go­ing to as­sign a prob­a­bil­ity of \(0.6\) to the coin com­ing up heads, and a prob­a­bil­ity of \(0.7\) to the coin com­ing up tails.

If you’re already used to the standard notion of probability, you might object, “But those probabilities sum to \(1.3\) when they ought to sum to \(1\)!” (Note: Or maybe a tiny bit less than \(1,\) in case the coin lands on its edge or something.) But now we are in coherence-land; we don’t ask “Did we violate the standard axioms that all the textbooks use?” but “What rules must non-dominated strategies obey?” De gustibus non est disputandum; can we disputandum somebody saying that a coin has a 60% probability of coming up heads and a 70% probability of coming up tails? (Where these are the only 2 possible outcomes of an uncertain coinflip.)

Well—as­sum­ing you’ve already ac­cepted that we need util­ity-mul­ti­ply­ing thin­gies—I might then offer you a gam­ble. How about you give me one ap­ple, and if the coin lands heads, I’ll give you 0.8 ap­ples; while if the coin lands tails, I’ll give you 0.8 ap­ples.

Ac­cord­ing to you, the ex­pected util­ity of this gam­ble is:

$$\mathbb P(\text{heads}) \cdot U(\text{0.8 apples}) + \mathbb P(\text{tails}) \cdot U(\text{0.8 apples}) \\ = 0.6 \cdot €0.8 + 0.7 \cdot €0.8 = €1.04.$$

You’ve just de­cided to trade your ap­ple for 0.8 ap­ples, which sure sounds like one of ’em dom­i­nated strate­gies.
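Here is the same calculation as a sketch, using exact fractions so the 1.04 is not blurred by floating point. The `weighted_value` helper is a made-up name, and the agent is assumed to accept a trade whenever its weighted value exceeds the 1 apple it gives up.

```python
from fractions import Fraction

def weighted_value(weights, payoffs):
    """Sum of weight * payoff over outcomes; payoffs are in apples."""
    return sum(weights[outcome] * payoffs[outcome] for outcome in payoffs)

weights = {"heads": Fraction(6, 10), "tails": Fraction(7, 10)}
payoffs = {"heads": Fraction(8, 10), "tails": Fraction(8, 10)}

value = weighted_value(weights, payoffs)
accepts = value > 1                           # the agent thinks the trade is a gain
sure_loss = max(payoffs.values()) < 1         # yet every outcome returns under 1 apple
print(value, accepts, sure_loss)  # 26/25 True True
```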

And that’s why the thingies you multiply utilities by (the thingies that you use to weight uncertain outcomes in your imagination, when you’re trying to decide how much you want one branch of an uncertain choice) must sum to 1, whether you call them ‘probabilities’ or not.

Well… actually we just argued that probabilities for mutually exclusive outcomes should sum to no more than 1. (Note: Nothing we’re walking through here is really a coherence theorem per se, more like intuitive arguments that a coherence theorem ought to exist. Theorems require proofs, and nothing here is what real mathematicians would consider to be a ‘proof’.) What would be an example showing that, for non-dominated strategies, the probabilities for exhaustive outcomes should sum to no less than 1?

Sup­pose that, in ex­change for 1 ap­ple, I cred­ibly offer:

  • To pay you 1.1 ap­ples if a coin comes up heads.

  • To pay you 1.1 ap­ples if a coin comes up tails.

  • To pay you 1.1 ap­ples if any­thing else hap­pens.

If the prob­a­bil­ities you as­sign to these three out­comes sum to say 0.9, you will re­fuse to trade 1 ap­ple for 1.1 ap­ples.

(This is strictly dominated by the strategy of agreeing to trade 1 apple for 1.1 apples.)
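Both directions of the argument can be folded into one sketch. Suppose an agent’s weights over exhaustive, mutually exclusive outcomes sum to some total other than 1, and it is offered the same payoff in every outcome in exchange for 1 apple. The functions below are hypothetical helpers: `accepts` applies the weighted-value rule, and `exploiting_payoff` picks a sure payoff halfway between 1 and the reciprocal of the total, which the agent will misjudge.

```python
from fractions import Fraction

def accepts(total_weight, sure_payoff):
    """Does the agent trade 1 apple for `sure_payoff` apples paid in every outcome?"""
    return Fraction(total_weight) * sure_payoff > 1

def exploiting_payoff(total_weight):
    """A sure payoff the agent misjudges, or None if the weights sum to 1."""
    s = Fraction(total_weight)
    if s == 1:
        return None
    return (1 / s + 1) / 2

# Weights summing to 0.9: the agent refuses 1.1 apples for 1 apple.
print(accepts(Fraction(9, 10), Fraction(11, 10)))  # False
# Weights summing to 1.3: the agent accepts 0.8 apples for 1 apple.
print(accepts(Fraction(13, 10), Fraction(8, 10)))  # True

c = exploiting_payoff(Fraction(9, 10))
print(c, accepts(Fraction(9, 10), c))  # 19/18 False
```

Only a total of exactly 1 leaves no such payoff to exploit, which is the sense in which the weights are forced to sum to 1.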

Dutch book arguments

Another way we could have pre­sented es­sen­tially the same ar­gu­ment as above, is as fol­lows:

Sup­pose you are a mar­ket-maker in a pre­dic­tion mar­ket for some event \(X.\) When you say that your price for event \(X\) is \(x\), you mean that you will sell for \(\$x\) a ticket which pays \(\$1\) if \(X\) hap­pens (and pays out noth­ing oth­er­wise). In fact, you will sell any num­ber of such tick­ets!

Since you are a mar­ket-maker (that is, you are try­ing to en­courage trad­ing in \(X\) for what­ever rea­son), you are also will­ing to buy any num­ber of tick­ets at the price \(\$x.\) That is, I can say to you (the mar­ket-maker) “I’d like to sign a con­tract where you give me \(N \cdot \$x\) now, and in re­turn I must pay you \(\$N\) iff \(X\) hap­pens;” and you’ll agree. (We can view this as you sel­l­ing me a nega­tive num­ber of the origi­nal kind of ticket.)

Let \(X\) and \(Y\) de­note two events such that ex­actly one of them must hap­pen; say, \(X\) is a coin land­ing heads and \(Y\) is the coin not land­ing heads.

Now sup­pose that you, as a mar­ket-maker, are mo­ti­vated to avoid com­bi­na­tions of bets that lead into cer­tain losses for you—not just losses that are merely prob­a­ble, but com­bi­na­tions of bets such that ev­ery pos­si­bil­ity leads to a loss.

Then if ex­actly one of \(X\) and \(Y\) must hap­pen, your prices \(x\) and \(y\) must sum to ex­actly \(\$1.\) Be­cause:

  • If \(x + y < \$1,\) I buy both an \(X\)-ticket and a \(Y\)-ticket and get a guaran­teed pay­out of \(\$1\) minus costs of \(x + y.\) Since this is a guaran­teed profit for me, it is a guaran­teed loss for you.

  • If \(x + y > \$1,\) I sell you both tick­ets and will at the end pay you \(\$1\) af­ter you have already paid me \(x + y.\) Again, this is a guaran­teed profit for me of \(x + y - \$1 > \$0.\)
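The two cases above amount to a tiny arbitrage rule. The following sketch assumes prices are given in dollars for tickets paying $1, that exactly one of the two events happens, and that one ticket of each kind is traded; the name `dutch_book` and its return format are invented for the example.

```python
from fractions import Fraction

def dutch_book(x, y):
    """Given the market-maker's prices x and y for two events of which exactly
    one must happen, return (action, guaranteed_profit) for the counterparty,
    or None if the prices sum to exactly 1."""
    total = x + y
    if total < 1:
        # Pay x + y now, collect $1 whichever event happens.
        return ("buy both tickets", 1 - total)
    if total > 1:
        # Collect x + y now, pay out $1 whichever event happens.
        return ("sell both tickets", total - 1)
    return None

action, profit = dutch_book(Fraction(6, 10), Fraction(7, 10))
print(action, profit)  # sell both tickets 3/10
print(dutch_book(Fraction(1, 2), Fraction(1, 2)))  # None
```

The counterparty’s profit is the market-maker’s loss, and it does not depend on which event occurs.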

This is more or less exactly the same argument as in the previous section, with trading apples. Except that: (a) the scenario is more crisp, so it is easier to generalize and scale up to much more complicated similar arguments; and (b) it introduces a whole lot of assumptions that people new to expected utility would probably find rather questionable.

“What?” one might cry. “What sort of crazy bookie would buy and sell bets at ex­actly the same price? Why ought any­one to buy and sell bets at ex­actly the same price? Who says that I must value a gain of $1 ex­actly the op­po­site of a loss of $1? Why should the price that I put on a bet rep­re­sent my de­gree of un­cer­tainty about the en­vi­ron­ment? What does all of this ar­gu­ment about gam­bling have to do with real life?”

So again, the key idea is not that we are assuming anything about people valuing every real-world dollar the same; nor is it in real life a good idea to offer to buy or sell bets at the same prices. (Note: In real life this leads to a problem of ‘adversarial selection’ where somebody who knows more about the environment than you can decide whether to buy or sell from you. To put it another way, from a Bayesian standpoint, if an intelligent counterparty is deciding whether to buy or sell from you a bet on \(X\), the fact that they choose to buy (or sell) should cause you to update in favor of (or against) \(X\) actually happening. After all, they wouldn’t be taking the bet unless they thought they knew something you didn’t!) Rather, Dutch book arguments can stand in as shorthand for some longer story in which we only assume that you prefer more apples to less apples.

The Dutch book ar­gu­ment above has to be seen as one more added piece in the com­pany of all the other co­her­ence the­o­rems—for ex­am­ple, the co­her­ence the­o­rems sug­gest­ing that you ought to be quan­ti­ta­tively weigh­ing events in your mind in the first place.

Con­di­tional probability

With more com­pli­cated Dutch book ar­gu­ments, we can de­rive more com­pli­cated ideas such as ‘con­di­tional prob­a­bil­ity’.

Let’s say that we’re pric­ing three kinds of gam­bles over two events \(Q\) and \(R\):

  • A ticket that costs \(\$x\), and pays \(\$1\) if \(Q\) hap­pens.

  • A ticket that doesn’t cost any­thing or pay any­thing if \(Q\) doesn’t hap­pen (the ticket price is re­funded); and if \(Q\) does hap­pen, this ticket costs \(\$y,\) then pays \(\$1\) if \(R\) hap­pens.

  • A ticket that costs \(\$z\), and pays \(\$1\) if \(Q\) and \(R\) both hap­pen.

In­tu­itively, the idea of con­di­tional prob­a­bil­ity is that the prob­a­bil­ity of \(Q\) and \(R\) both hap­pen­ing, should be equal to the prob­a­bil­ity of \(Q\) hap­pen­ing, times the prob­a­bil­ity that \(R\) hap­pens as­sum­ing that \(Q\) hap­pens:

$$\mathbb P(Q \wedge R) = \mathbb P(Q) \cdot \mathbb P(R \mid Q)$$

To ex­hibit a Dutch book ar­gu­ment for this rule, we want to start from the as­sump­tion of a qual­i­ta­tively non-dom­i­nated strat­egy, and de­rive the quan­ti­ta­tive rule \(z = x \cdot y.\)

So let’s give an ex­am­ple that vi­o­lates this equa­tion and see if there’s a way to make a guaran­teed profit. Let’s say some­body:

  • Prices at x=$0.60 the first ticket, aka \(\mathbb P(Q)\)

  • Prices at y=$0.70 the sec­ond ticket, aka \(\mathbb P(R \mid Q)\)

  • Prices at z=$0.20 the third ticket, aka \(\mathbb P(Q \wedge R),\) which ought to be $0.42 as­sum­ing the first two prices.

The first two tick­ets are priced rel­a­tively high, com­pared to the third ticket which is priced rel­a­tively low, sug­gest­ing that we ought to sell the first two tick­ets and buy the third.

Okay, let’s ask what hap­pens if we sell 10 of the first ticket, sell 10 of the sec­ond ticket, and buy 10 of the third ticket.

  • If \(Q\) doesn’t hap­pen, we get $6, and pay $2. Net +$4.

  • If \(Q\) hap­pens and \(R\) doesn’t hap­pen, we get $6, pay $10, get $7, and pay $2. Net +$1.

  • If \(Q\) hap­pens and \(R\) hap­pens, we get $6, pay $10, get $7, pay $10, pay $2, and get $10. Net: +$1.

That is: we can get a guaran­teed pos­i­tive profit over all three pos­si­ble out­comes.
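The arithmetic for the three outcomes can be double-checked with a short sketch. The helper `trader_net` and its argument names are invented for this illustration; it just encodes the three ticket definitions given earlier, with negative quantities meaning tickets sold.

```python
from fractions import Fraction

def trader_net(a, b, c, x, y, z, q, r):
    """Net payoff to a trader who buys a, b, c units (negative = sells) of the
    three tickets at prices x, y, z, given whether Q and R happen."""
    net = a * ((1 if q else 0) - x)           # ticket 1: pays $1 if Q
    if q:                                      # ticket 2 is refunded unless Q happens
        net += b * ((1 if r else 0) - y)
    net += c * ((1 if (q and r) else 0) - z)   # ticket 3: pays $1 if Q and R
    return net

x, y, z = Fraction(60, 100), Fraction(70, 100), Fraction(20, 100)
outcomes = {"not Q": (False, False), "Q, not R": (True, False), "Q and R": (True, True)}
# Sell 10 of the first ticket, sell 10 of the second, buy 10 of the third.
nets = {name: trader_net(-10, -10, 10, x, y, z, q, r)
        for name, (q, r) in outcomes.items()}
print(nets)
```

Rerunning the same trade with the third price set to \(z = 0.42\) no longer gives a profit in every outcome, which is what the rule \(z = x \cdot y\) would lead us to expect.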

More generally, let \(A, B, C\) be the (potentially negative) amounts of the first, second, and third tickets being bought (buying a negative amount is selling). Then the prices \(x, y, z\) can be combined into a ‘Dutch book’ whenever the following three inequalities, which give the buyer’s net payoff in each of the three possible outcomes, can be simultaneously true, with at least one inequality strict:

$$\begin{array}{rrrl} -Ax & + 0 & - Cz & \geqq 0 \\ A(1-x) & - By & - Cz & \geqq 0 \\ A(1-x) & + B(1-y) & + C(1-z) & \geqq 0 \end{array}$$

For \(x, y, z \in (0, 1)\) this is impossible exactly iff \(z = x \cdot y.\) The proof via a bunch of algebra is left as an exercise to the reader. (Note: The quick but advanced argument would be to say that the left-hand side must look like a singular matrix, whose determinant must therefore be zero.)
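For readers who want to check the determinant remark by hand, here is a sketch of the computation. It reads the three left-hand sides as a matrix acting on the vector \((A, B, C)\); that framing is one way to interpret the remark, and it does not replace the full algebraic proof.

$$\det \begin{pmatrix} -x & 0 & -z \\ 1-x & -y & -z \\ 1-x & 1-y & 1-z \end{pmatrix} = -x\,(z - y) - z\,(1 - x) = x y - z$$

So the matrix is singular precisely when \(z = x \cdot y.\) In that case, weighting the three rows by \(1-x,\) \(x(1-y),\) and \(xy\) (all positive, and they look suspiciously like the probabilities of the three outcomes) makes the weighted sum of the left-hand sides vanish for every choice of \(A, B, C,\) so the three payoffs cannot all be nonnegative with one of them strictly positive.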

The Allais Paradox

By now, you’d prob­a­bly like to see a glimpse of the sort of ar­gu­ment that shows in the first place that we need ex­pected util­ity—that a non-dom­i­nated strat­egy for un­cer­tain choice must be­have as if mul­ti­ply­ing util­ities by some kinda util­ity-mul­ti­ply­ing thin­gies (‘prob­a­bil­ities’).

As far as I un­der­stand it, the real ar­gu­ment you’re look­ing for is Abra­ham Wald’s com­plete class the­o­rem, which I must con­fess I don’t know how to re­duce to a sim­ple demon­stra­tion.

But we can catch a glimpse of the gen­eral idea from a fa­mous psy­chol­ogy ex­per­i­ment that be­came known as the Allais Para­dox (in slightly adapted form).

Sup­pose you ask some ex­per­i­men­tal sub­jects which of these gam­bles they would rather play:

  • 1A: A cer­tainty of $1,000,000.

  • 1B: 90% chance of win­ning $5,000,000, 10% chance of win­ning noth­ing.

Most sub­jects say they’d pre­fer 1A to 1B.

Now ask a sep­a­rate group of sub­jects which of these gam­bles they’d pre­fer:

  • 2A: 50% chance of win­ning $1,000,000; 50% chance of win­ning $0.

  • 2B: 45% chance of win­ning $5,000,000; 55% chance of win­ning $0.

In this case, most sub­jects say they’d pre­fer gam­ble 2B.

Note that the $ sign here de­notes real dol­lars, not util­ities! A gain of five mil­lion dol­lars isn’t, and shouldn’t be, worth ex­actly five times as much to you as a gain of one mil­lion dol­lars. We can use the € sym­bol to de­note the ex­pected util­ities that are ab­stracted from how much you rel­a­tively value differ­ent out­comes; $ is just money.

So we cer­tainly aren’t claiming that the first prefer­ence is para­dox­i­cal be­cause 1B has an ex­pected dol­lar value of $4.5 mil­lion and 1A has an ex­pected dol­lar value of $1 mil­lion. That would be silly. We care about ex­pected util­ities, not ex­pected dol­lar val­ues, and those two con­cepts aren’t the same at all!

Nonethe­less, the com­bined prefer­ences 1A > 1B and 2A < 2B are not com­pat­i­ble with any co­her­ent util­ity func­tion. We can­not si­mul­ta­neously have:

$$\begin{array}{rcl} U(\text{gain \$1 million}) & > & 0.9 \cdot U(\text{gain \$5 million}) + 0.1 \cdot U(\text{gain \$0}) \\ 0.5 \cdot U(\text{gain \$0}) + 0.5 \cdot U(\text{gain \$1 million}) & < & 0.45 \cdot U(\text{gain \$5 million}) + 0.55 \cdot U(\text{gain \$0}) \end{array}$$
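To see the conflict explicitly, take the preference 2A < 2B, subtract \(0.5 \cdot U(\text{gain \$0})\) from both sides of its expected-utility statement, and double what remains. This is just a worked rearrangement, with no assumptions beyond the four gambles as stated:

$$\begin{array}{rcl} 0.5 \cdot U(\text{gain \$1 million}) & < & 0.45 \cdot U(\text{gain \$5 million}) + 0.05 \cdot U(\text{gain \$0}) \\ U(\text{gain \$1 million}) & < & 0.9 \cdot U(\text{gain \$5 million}) + 0.1 \cdot U(\text{gain \$0}) \end{array}$$

The last line is the exact reverse of what the preference 1A > 1B requires, so no assignment of utilities to the three outcomes can satisfy both.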

This was one of the ear­liest ex­per­i­ments seem­ing to demon­strate that ac­tual hu­man be­ings were not ex­pected util­ity max­i­miz­ers—a very tame idea nowa­days, to be sure, but the first definite demon­stra­tion of that was a big deal at the time. Hence the term, “Allais Para­dox”.

Now by the gen­eral idea be­hind co­her­ence the­o­rems, since we can’t view this be­hav­ior as cor­re­spond­ing to ex­pected util­ities, we ought to be able to show that it cor­re­sponds to a dom­i­nated strat­egy some­how—de­rive some way in which this be­hav­ior cor­re­sponds to shoot­ing off your own foot.

In this case, the rele­vant idea seems non-ob­vi­ous enough that it doesn’t seem rea­son­able to de­mand that you think of it on your own; but if you like, you can pause and try to think of it any­way. Other­wise, just con­tinue read­ing.

Again, the gam­bles are as fol­lows:

  • 1A: A cer­tainty of $1,000,000.

  • 1B: 90% chance of win­ning $5,000,000, 10% chance of win­ning noth­ing.

  • 2A: 50% chance of win­ning $1,000,000; 50% chance of win­ning $0.

  • 2B: 45% chance of win­ning $5,000,000; 55% chance of win­ning $0.

Now ob­serve that Sce­nario 2 cor­re­sponds to a 50% chance of play­ing Sce­nario 1, and oth­er­wise get­ting $0.
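This observation can be verified mechanically. The sketch below represents each gamble as a table from dollar prize to probability and builds the compound gamble "50% chance of playing this, otherwise $0"; the representation and the helper name `mix` are choices made for this example only.

```python
from fractions import Fraction

def mix(p, lottery, other):
    """Compound lottery: play `lottery` with probability p, else `other`.
    Lotteries are dicts mapping dollar prize -> probability."""
    out = {}
    for prize, prob in lottery.items():
        out[prize] = out.get(prize, 0) + p * prob
    for prize, prob in other.items():
        out[prize] = out.get(prize, 0) + (1 - p) * prob
    return out

half = Fraction(1, 2)
nothing = {0: Fraction(1)}
g1a = {1_000_000: Fraction(1)}
g1b = {5_000_000: Fraction(9, 10), 0: Fraction(1, 10)}
g2a = {1_000_000: half, 0: half}
g2b = {5_000_000: Fraction(45, 100), 0: Fraction(55, 100)}

print(mix(half, g1a, nothing) == g2a)  # 2A is a coin flip between 1A and nothing
print(mix(half, g1b, nothing) == g2b)  # 2B is a coin flip between 1B and nothing
```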

This, in fact, is why the com­bi­na­tion 1A > 1B; 2A < 2B is in­com­pat­i­ble with ex­pected util­ity. In terms of one set of ax­ioms fre­quently used to de­scribe ex­pected util­ity, it vi­o­lates the In­de­pen­dence Ax­iom: if a gam­ble \(L\) is preferred to \(M\), that is \(L > M\), then we ought to be able to take a con­stant prob­a­bil­ity \(p > 0\) and an­other gam­ble \(N\) and have \(p \cdot L + (1-p)\cdot N > p \cdot M + (1-p) \cdot N.\)

To put it an­other way, if I flip a coin to de­cide whether or not to play some en­tirely differ­ent game \(N,\) but oth­er­wise let you choose \(L\) or \(M,\) you ought to make the same choice as if I just ask you whether you pre­fer \(L\) or \(M\). Your prefer­ence be­tween \(L\) and \(M\) should be ‘in­de­pen­dent’ of the pos­si­bil­ity that, in­stead of do­ing any­thing what­so­ever with \(L\) or \(M,\) we will do some­thing else in­stead.

And since this is an ax­iom of ex­pected util­ity, any vi­o­la­tion of that ax­iom ought to cor­re­spond to a dom­i­nated strat­egy some­how.

In the case of the Allais Para­dox, we do the fol­low­ing:

First, I show you a switch that can be set to A or B, cur­rently set to A.

In one minute, I tell you, I will flip a coin. If the coin comes up heads, you will get noth­ing. If the coin comes up tails, you will play the gam­ble from Sce­nario 1.

From your cur­rent per­spec­tive, that is, we are play­ing Sce­nario 2: since the switch is set to A, you have a 50% chance of get­ting noth­ing and a 50% chance of get­ting $1 mil­lion.

I ask you if you’d like to pay a penny to throw the switch from A to B. Since you pre­fer gam­ble 2B to 2A, and some quite large amounts of money are at stake, you agree to pay the penny. From your per­spec­tive, you now have a 55% chance of end­ing up with noth­ing and a 45% chance of get­ting $5M.

I then flip the coin, and luck­ily for you, it comes up tails.

From your per­spec­tive, you are now in Sce­nario 1B. Hav­ing ob­served the coin and up­dated on its state, you now think you have a 90% chance of get­ting $5 mil­lion and a 10% chance of get­ting noth­ing. By hy­poth­e­sis, you would pre­fer a cer­tainty of $1 mil­lion.

So I offer you a chance to pay an­other penny to flip the switch back from B to A. And with so much money at stake, you agree.

I have taken your two cents on the sub­ject.

That is: You paid a penny to flip a switch and then paid an­other penny to switch it back, and this is dom­i­nated by the strat­egy of just leav­ing the switch set to A.
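The switch game can be played out in a small simulation. This is a sketch of the procedure as described, with the agent's preferences passed in as flags; the function and parameter names are hypothetical.

```python
def run_switch_game(coin_is_tails, prefers_2b=True, prefers_1a=True):
    """Return (final switch setting, cents paid) for an agent with the given
    preferences. The switch starts at A and each throw costs one cent."""
    switch, cents_paid = "A", 0
    # Before the flip the agent faces Scenario 2 and pays to move toward 2B.
    if prefers_2b:
        switch, cents_paid = "B", cents_paid + 1
    if not coin_is_tails:
        return switch, cents_paid  # heads: the agent gets nothing either way
    # After tails the agent faces Scenario 1 and pays to move back toward 1A.
    if prefers_1a and switch == "B":
        switch, cents_paid = "A", cents_paid + 1
    return switch, cents_paid

print(run_switch_game(coin_is_tails=True))                     # Allais-style agent
print(run_switch_game(coin_is_tails=True, prefers_2b=False))   # consistently prefers A
```

On tails, the agent with the Allais preferences ends up with the switch at A, exactly where it started, but two cents poorer; on heads it gets nothing either way and is still one cent poorer. The agent that consistently prefers A plays the same gamble and pays nothing.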

And that’s at least a glimpse of why, if you’re not us­ing dom­i­nated strate­gies, the thing you do with rel­a­tive util­ities is mul­ti­ply them by prob­a­bil­ities in a con­sis­tent way, and pre­fer the choice that leads to a greater ex­pec­ta­tion of the vari­able rep­re­sent­ing util­ity.

From the Allais Para­dox to real life

The real-life les­son about what to do when faced with Allais’s dilemma might be some­thing like this:

There’s some amount that $1 mil­lion would im­prove your life com­pared to $0.

There’s some amount that an ad­di­tional $4 mil­lion would fur­ther im­prove your life af­ter the first $1 mil­lion.

You ought to vi­su­al­ize these two im­prove­ments as best you can, and de­cide whether an­other $4 mil­lion can pro­duce at least one-ninth as much im­prove­ment, as much true value to you, as the first $1 mil­lion.

If it can, you should con­sis­tently pre­fer 1B > 1A; 2B > 2A. And if not, you should con­sis­tently pre­fer 1A > 1B; 2A > 2B.
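The one-ninth threshold can be checked directly. As a sketch, write \(a\) for the value to you of the first $1 million over $0 and \(b\) for the value of the additional $4 million, and measure utility relative to getting $0 (the symbols \(a\) and \(b\) are introduced here only for illustration):

$$\begin{array}{rcl} \text{1B} \geq \text{1A} & \iff & 0.9 \cdot (a + b) \geq a \iff b \geq a/9 \\ \text{2B} \geq \text{2A} & \iff & 0.45 \cdot (a + b) \geq 0.5 \cdot a \iff b \geq a/9 \end{array}$$

Both scenarios turn on the same comparison, which is why a consistent answer to one fixes the answer to the other.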

The standard ‘paradoxical’ preferences in Allais’s experiment are standardly attributed to a certainty effect: people value the certainty of having $1 million, while the difference between a 50% probability and a 55% probability of winning nothing looms less large. (And this ties in to a number of other results about certainty, need for closure, prospect theory, and so on.)

It may sound in­tu­itive, in an Allais-like sce­nario, to say that you ought to de­rive some value from be­ing cer­tain about the out­come. In fact this is just the rea­son­ing the ex­per­i­ment shows peo­ple to be us­ing, so of course it might sound in­tu­itive. But that does, in­escapably, cor­re­spond to a kind of think­ing that pro­duces dom­i­nated strate­gies.

One pos­si­ble ex­cuse might be that cer­tainty is valuable if you need to make plans about the fu­ture; know­ing the ex­act fu­ture lets you make bet­ter plans. This is ad­mit­tedly true and a phe­nomenon within ex­pected util­ity, though it ap­plies in a smooth way as con­fi­dence in­creases rather than jump­ing sud­denly around 100%. But in the par­tic­u­lar dilemma as de­scribed here, you only have 1 minute be­fore the game is played, and no time to make other ma­jor life choices de­pen­dent on the out­come.

Another pos­si­ble ex­cuse for cer­tainty bias might be to say: “Well, I value the emo­tional feel­ing of cer­tainty.”

In real life, we do have emo­tions that are di­rectly about prob­a­bil­ities, and those lit­tle flashes of hap­piness or sad­ness are worth some­thing if you care about peo­ple be­ing happy or sad. If you say that you value the emo­tional feel­ing of be­ing cer­tain of get­ting $1 mil­lion, the free­dom from the fear of get­ting $0, for the minute that the dilemma lasts and you are ex­pe­rienc­ing the emo­tion—well, that may just be a fact about what you value, even if it ex­ists out­side the ex­pected util­ity for­mal­ism.

And this genuinely does not fit into the expected utility formalism. In an expected utility agent, probabilities are just thingies-you-multiply-utilities-by. If those thingies start generating their own utilities once represented inside the mind of a person who is an object of ethical value, you really are going to get results that are incompatible with the formal decision theory.

However, not being viewable as an expected utility agent does always correspond to employing dominated strategies. You are giving up something in exchange, if you pursue that feeling of certainty. You are potentially losing all the real value you could have gained from another $4 million, if that realized future actually would have gained you more than one-ninth the value of the first $1 million. Is a fleeting emotional sense of certainty over 1 minute worth automatically discarding the potential $5-million outcome? Even if the correct answer given your values is that you properly ought to take the $1 million, treasuring 1 minute of emotional gratification doesn’t seem like the wise reason to do that. The wise reason would be if the first $1 million really was worth that much more than the next $4 million.

The dan­ger of say­ing, “Oh, well, I at­tach a lot of util­ity to that com­fortable feel­ing of cer­tainty, so my choices are co­her­ent af­ter all” is not that it’s math­e­mat­i­cally im­proper to value the emo­tions we feel while we’re de­cid­ing. Rather, by say­ing that the most valuable stakes are the emo­tions you feel dur­ing the minute you make the de­ci­sion, what you’re say­ing is, “I get a huge amount of value by mak­ing de­ci­sions how­ever hu­mans in­stinc­tively make their de­ci­sions, and that’s much more im­por­tant than the thing I’m mak­ing a de­ci­sion about.” This could well be true for some­thing like buy­ing a stuffed an­i­mal. If mil­lions of dol­lars or hu­man lives are at stake, maybe not so much.


The demon­stra­tions we’ve walked through here aren’t the pro­fes­sional-grade co­her­ence the­o­rems as they ap­pear in real math. Those have names like “Cox’s The­o­rem” or “the com­plete class the­o­rem”; their proofs are difficult; and they say things like “If see­ing piece of in­for­ma­tion A fol­lowed by piece of in­for­ma­tion B leads you into the same epistemic state as see­ing piece of in­for­ma­tion B fol­lowed by piece of in­for­ma­tion A, plus some other as­sump­tions, I can show an iso­mor­phism be­tween those epistemic states and clas­si­cal prob­a­bil­ities” or “Any de­ci­sion rule for tak­ing differ­ent ac­tions de­pend­ing on your ob­ser­va­tions ei­ther cor­re­sponds to Bayesian up­dat­ing given some prior, or else is strictly dom­i­nated by some Bayesian strat­egy”.

But hope­fully you’ve seen enough con­crete demon­stra­tions to get a gen­eral idea of what’s go­ing on with the ac­tual co­her­ence the­o­rems. We have mul­ti­ple spotlights all shin­ing on the same core math­e­mat­i­cal struc­ture, say­ing dozens of differ­ent var­i­ants on, “If you aren’t run­ning around in cir­cles or step­ping on your own feet or wan­tonly giv­ing up things you say you want, we can see your be­hav­ior as cor­re­spond­ing to this shape. Con­versely, if we can’t see your be­hav­ior as cor­re­spond­ing to this shape, you must be visi­bly shoot­ing your­self in the foot.” Ex­pected util­ity is the only struc­ture that has this great big fam­ily of dis­cov­ered the­o­rems all say­ing that. It has a scat­ter­ing of aca­demic com­peti­tors, be­cause academia is academia, but the com­peti­tors don’t have any­thing like that mass of spotlights all point­ing in the same di­rec­tion.

So if we need to pick an in­terim an­swer for “What kind of quan­ti­ta­tive frame­work should I try to put around my own de­ci­sion-mak­ing, when I’m try­ing to check if my thoughts make sense?” or “By de­fault and bar­ring spe­cial cases, what prop­er­ties might a suffi­ciently ad­vanced ma­chine in­tel­li­gence look to us like it had at least ap­prox­i­mately, if we couldn’t see it visi­bly run­ning around in cir­cles?”, then there’s pretty much one ob­vi­ous can­di­date: Prob­a­bil­ities, util­ity func­tions, and ex­pected util­ity.

Fur­ther reading

  • To learn more about agents and AI: In­ter­est­ing cog­ni­tion and be­hav­ior that can be de­rived just from the no­tion of ex­pected util­ity, fol­lowed by Is ex­pected util­ity a good way to think about the de­fault be­hav­ior of suffi­ciently ad­vanced Ar­tifi­cial In­tel­li­gences?

  • To learn more about de­ci­sion the­ory: The con­tro­ver­sial coun­ter­fac­tual at the heart of the ex­pected util­ity for­mula.

