# Belief revision as probability elimination

One way of understanding the reasoning behind Bayes’ rule is that the process of updating $$\mathbb P$$ in the face of new evidence can be interpreted as the elimination of probability mass from $$\mathbb P$$ (namely, all the probability mass inconsistent with the evidence).

todo: we have a request to use a not-Diseasitis problem here, because it was getting repetitive.

!if-after(Frequency diagrams: A first look at Bayes): We’ll use the Diseasitis problem as a first example:

You are screening a set of patients for a disease, which we’ll call Diseasitis. You expect that around 20% of the patients in the screening population start out with Diseasitis. You are testing for the presence of the disease using a tongue depressor with a sensitive chemical strip. Among patients with Diseasitis, 90% turn the tongue depressor black. However, 30% of the patients without Diseasitis will also turn the tongue depressor black. One of your patients comes into the office, takes your test, and turns the tongue depressor black. Given only that information, what is the probability that they have Diseasitis?

if-after(Frequency diagrams: A first look at Bayes): We’ll again start with the Diseasitis example: a population has a 20% prior prevalence of Diseasitis, and we use a test with a 90% true-positive rate and a 30% false-positive rate.

In the situation with a single individual patient, before observing any evidence, there are four possible worlds we could be in:

todo: change LaTeX to show Sick in red, Healthy in blue

$$\begin{array}{l|r|r} & Sick & Healthy \\ \hline Test + & 18\% & 24\% \\ \hline Test - & 2\% & 56\% \end{array}$$
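The four joint probabilities in the table just come from multiplying the prior by the relevant test rate. A quick sketch in Python (variable names are mine, not from the original):

```python
# Joint probability of each of the four possible worlds in the Diseasitis
# problem: 20% prior prevalence, 90% true-positive rate, 30% false-positive rate.
prior_sick = 0.20
true_positive_rate = 0.90    # P(test+ | sick)
false_positive_rate = 0.30   # P(test+ | healthy)

sick_pos = prior_sick * true_positive_rate                  # 18%
sick_neg = prior_sick * (1 - true_positive_rate)            # 2%
healthy_pos = (1 - prior_sick) * false_positive_rate        # 24%
healthy_neg = (1 - prior_sick) * (1 - false_positive_rate)  # 56%
```

Note that the four numbers sum to 100%: before any observation, exactly one of these worlds must be the actual one.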

To observe that the patient gets a positive result is to eliminate from further consideration the possible worlds where the patient gets a negative result, and vice versa.

So Bayes’ rule says: to update your beliefs in the face of evidence, simply throw away the probability mass that was inconsistent with the evidence.
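That elimination-and-renormalization step can be sketched directly in Python, using the Diseasitis joint probabilities from the table (the dictionary representation is mine, for illustration):

```python
# Possible worlds with their prior probability mass.
worlds = {
    ("sick", "test+"): 0.18,
    ("sick", "test-"): 0.02,
    ("healthy", "test+"): 0.24,
    ("healthy", "test-"): 0.56,
}

# Observing a positive test eliminates every world with a negative test...
surviving = {w: p for w, p in worlds.items() if w[1] == "test+"}

# ...and the surviving mass is renormalized to sum to 1.
total = sum(surviving.values())                     # 0.42
posterior = {w: p / total for w, p in surviving.items()}

p_sick = posterior[("sick", "test+")]               # 0.18 / 0.42 = 3/7, about 43%
```

Nothing in the update touches the relative proportions of the surviving worlds; only the eliminated worlds and the overall scale change.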

# Example: Socks-dresser problem

Realizing that observing evidence corresponds to eliminating probability mass and concerning ourselves only with the probability mass that remains is the key to solving the sock-dresser search problem:

You left your socks somewhere in your room. You think there’s a 4/5 chance that they’ve been tossed into some random drawer of your dresser, so you start looking through your dresser’s 8 drawers. After checking 6 drawers at random, you haven’t found your socks yet. What is the probability you will find your socks in the next drawer you check?

todo: request for a picture here

(You can optionally try to solve this problem yourself before continuing.)

We initially have 20% of the probability mass in “Socks outside the dresser” and 80% in “Socks inside the dresser”. The 80% corresponds to 10% probability mass for each of the 8 drawers (because each drawer is equally likely to contain the socks).

After eliminating the probability mass in 6 of the drawers, we have 40% of the original mass remaining: 20% for “Socks outside the dresser” and 10% each for the remaining 2 drawers.

Since this remaining 40% probability mass is now our whole world, the effect on our probability distribution is like amplifying the 40% until it expands back up to 100%, i.e., renormalizing the probability distribution. Within the remaining prior probability mass of 40%, the “outside the dresser” hypothesis has half of it (prior 20%), and the two unchecked drawers have a quarter each (prior 10% each).

So the probability of finding our socks in the next drawer is 25%.
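The same eliminate-and-renormalize arithmetic, sketched in Python:

```python
# Sock-dresser search: 1/5 prior mass that the socks are outside the dresser,
# 4/5 inside, spread evenly over 8 drawers.
p_outside = 0.20
p_each_drawer = 0.80 / 8            # 10% per drawer

# Checking 6 drawers eliminates their mass; the rest survives.
remaining = p_outside + 2 * p_each_drawer   # 40% of the original mass

# Renormalize within the surviving mass: the chance the next
# (7th) drawer holds the socks.
p_next = p_each_drawer / remaining          # 0.10 / 0.40 = 0.25
```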

For some more flavorful examples of this method of using Bayes’ rule, see “The ups and downs of the hope function in a fruitless search.”


# Extension to subjective probability

In the Bayesian paradigm, this idiom of belief revision as conditioning a probability distribution on evidence works both in cases where there are statistical populations with objective frequencies corresponding to the probabilities, and in cases where our uncertainty is subjective.

For example, imagine being a king thinking about a uniquely weird person who seems around 20% likely to be an assassin. This doesn’t mean that there’s a population of similar people of whom 20% are assassins; it means that you weighed up your uncertainty and guesses and decided that you would bet at odds of 1 : 4 that she’s an assassin.

You then estimate that, if this person is an assassin, she’s 90% likely to own a dagger, so far as your subjective uncertainty goes; if you imagine her being an assassin, you think that 9 : 1 would be good betting odds for her owning one. If this particular person is not an assassin, you feel the probability that she owns a dagger is around 30%.

When you have your guards search her and they find a dagger, then (according to students of Bayes’ rule) you should update your beliefs in the same way you updated them in the Diseasitis setting, where there is a large population with an objective frequency of sickness, despite the fact that this maybe-assassin is a unique case. According to a Bayesian, your brain can track the probabilities of different possibilities regardless, even when there are no large populations and objective frequencies anywhere to be found. When you update your beliefs using evidence, you’re not “eliminating people from consideration”; you’re eliminating probability mass from certain possible worlds represented in your own subjective belief state.
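The arithmetic is identical to the Diseasitis case, even though the probabilities here are subjective betting odds rather than population frequencies. A sketch:

```python
# Subjective assassin update: prior 20% (odds 1 : 4),
# P(dagger | assassin) = 0.9, P(dagger | not assassin) = 0.3.
prior = 0.20
p_dagger_if_assassin = 0.90
p_dagger_if_not = 0.30

# Finding the dagger eliminates the no-dagger worlds; what survives is:
assassin_with_dagger = prior * p_dagger_if_assassin       # 0.18
innocent_with_dagger = (1 - prior) * p_dagger_if_not      # 0.24

# Renormalize within the surviving mass.
posterior = assassin_with_dagger / (assassin_with_dagger + innocent_with_dagger)
# 0.18 / 0.42 = 3/7, about 43%
```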

todo: add answer check for assassin

Parents:

• Bayes' rule

Bayes’ rule is the core theorem of probability theory saying how to revise our beliefs when we make a new observation.

• From earlier pages, this will be harder than it initially appears.

Say that, on hearing there was a dagger, the king also ponders what other things an assassin might carry. Poisons, for sure; an assassin would own them at odds of at least 6 : 1, while a non-assassin would surely be at no more than 0.02 : 1.

Listening devices, for spying and stuff. Probably 4 : 1 and 0.02 : 1 again.

Naive Bayesian analysis would say that a doctor, with her knives and stethoscope and drugs, would be carrying those tools at odds of 9 × 6 × 4 : 1 = 216 : 1 if she were an assassin, and 0.02 × 0.02 × 0.3 : 1 = 0.00012 : 1 if she were not.
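The multiplication in that comment can be checked directly (a sketch using the comment’s own rough odds; the independence assumption it criticizes is built in):

```python
# Naive (independence-assuming) combination of three pieces of evidence,
# each expressed as odds of owning the item.
dagger, poison, bug = 9, 6, 4                  # odds given assassin
dagger_n, poison_n, bug_n = 0.3, 0.02, 0.02    # odds given not an assassin

odds_if_assassin = dagger * poison * bug       # 216
odds_if_not = dagger_n * poison_n * bug_n      # 0.00012
```

Naively, that combined likelihood ratio (216 / 0.00012 = 1,800,000 : 1) would swamp almost any prior, which is exactly the worry the comment raises: the three items are correlated, so multiplying them as if independent overcounts the evidence.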

It’s very easy to forget that “updated odds” are also “updated conditions”. The king has not updated his odds of that person being an assassin; he has replaced his odds of “the person” being an assassin with new odds of “the person with the dagger” being an assassin.

Subsequent odds need to be updated in that light. The need to accurately track the decreasing weight of subsequent pieces of evidence seems unlikely to be intuitively grasped by most.