# Frequency diagrams: A first look at Bayes

Bayesian rea­son­ing is about how to re­vise our be­liefs in the light of ev­i­dence.

We’ll start by con­sid­er­ing one sce­nario in which the strength of the ev­i­dence has clear num­bers at­tached.

(Don’t worry if you don’t know how to solve the fol­low­ing prob­lem. We’ll see shortly how to solve it.)

Suppose you are a nurse screening a set of students for a sickness called Diseasitis. (Note: Literally “inflammation of the disease”.)

• You know, from past pop­u­la­tion stud­ies, that around 20% of the stu­dents will have Dise­a­sitis at this time of year.

You are test­ing for Dise­a­sitis us­ing a color-chang­ing tongue de­pres­sor, which usu­ally turns black if the stu­dent has Dise­a­sitis.

• Among pa­tients with Dise­a­sitis, 90% turn the tongue de­pres­sor black.

• How­ever, the tongue de­pres­sor is not perfect, and also turns black 30% of the time for healthy stu­dents.

One of your stu­dents comes into the office, takes the test, and turns the tongue de­pres­sor black. What is the prob­a­bil­ity that they have Dise­a­sitis?

(If you think you see how to do it, you can try to solve this problem before continuing. To quickly check your answer, it is stated just below; the derivation will be given shortly.)

The probability that a student with a blackened tongue depressor has Diseasitis is 3/7, roughly 43%.

This prob­lem can be solved a hard way or a clever easy way. We’ll walk through the hard way first.

First, we imagine a population of 100 students, of whom 20 have Diseasitis and 80 don’t. (Note: Multiple studies show that thinking about concrete numbers such as “20 out of 100 students” or “200 out of 1000 students” is more likely to produce correct spontaneous reasoning on these problems than thinking about percentages like “20% of students.” E.g. “Probabilistic reasoning in clinical medicine” by David M. Eddy (1982).)

90% of sick stu­dents turn their tongue de­pres­sor black, and 30% of healthy stu­dents turn the tongue de­pres­sor black. So we see black tongue de­pres­sors on 90% * 20 = 18 sick stu­dents, and 30% * 80 = 24 healthy stu­dents.

What’s the prob­a­bil­ity that a stu­dent with a black tongue de­pres­sor has Dise­a­sitis? From the di­a­gram, there are 18 sick stu­dents with black tongue de­pres­sors. 18 + 24 = 42 stu­dents in to­tal turned their tongue de­pres­sors black. Imag­ine reach­ing into a bag of all the stu­dents with black tongue de­pres­sors, and pul­ling out one of those stu­dents at ran­dom; what’s the chance a stu­dent like that is sick?
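The bag-drawing thought experiment above can also be sketched as a small simulation. This is only an illustrative sketch: the function name `simulate_black_depressor_bag`, the sample size, and the fixed random seed are assumptions made for the example, while the 20%, 90%, and 30% figures come from the problem statement.

```python
import random

def simulate_black_depressor_bag(n_students=100_000, seed=0):
    """Estimate the fraction of sick students among those whose depressor turned black."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    bag_size = 0       # students whose tongue depressor turned black
    sick_in_bag = 0    # of those, how many actually have Diseasitis
    for _ in range(n_students):
        is_sick = rng.random() < 0.20          # 20% of students are sick
        p_black = 0.90 if is_sick else 0.30    # chance the depressor turns black
        if rng.random() < p_black:
            bag_size += 1
            if is_sick:
                sick_in_bag += 1
    return sick_in_bag / bag_size

estimate = simulate_black_depressor_bag()
print(f"Estimated chance that a student drawn from the bag is sick: {estimate:.3f}")
```

With a large simulated population the estimate should land close to the exact frequency-based answer, though any single run carries some sampling noise.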

The final answer is that a patient with a black tongue depressor has an 18/42 = 3/7 ≈ 43% probability of being sick.
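The frequency calculation can be written out as a short sketch. The function name `frequency_posterior` and its parameter names are hypothetical, chosen for this example; exact fractions are used so that the counts of 18 and 24 students appear without rounding.

```python
from fractions import Fraction

def frequency_posterior(population, prior, p_black_if_sick, p_black_if_healthy):
    """Return (sick_black, healthy_black, P(sick | black)) for an imagined population."""
    sick = population * prior                         # 20 of 100 students
    healthy = population - sick                       # 80 of 100 students
    sick_black = sick * p_black_if_sick               # sick students with a black depressor
    healthy_black = healthy * p_black_if_healthy      # healthy students with a black depressor
    posterior = sick_black / (sick_black + healthy_black)
    return sick_black, healthy_black, posterior

sick_black, healthy_black, posterior = frequency_posterior(
    100, Fraction(20, 100), Fraction(90, 100), Fraction(30, 100)
)
print(sick_black, healthy_black, posterior, float(posterior))
```

Changing `population` to 1000 scales both counts by ten but leaves the final ratio at 3/7, which is why the choice of 100 students is only a convenience.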

Many med­i­cal stu­dents have at first found this an­swer counter-in­tu­itive: The test cor­rectly de­tects Dise­a­sitis 90% of the time! If the test comes back pos­i­tive, why is it still less than 50% likely that the pa­tient has Dise­a­sitis? Well, the test also in­cor­rectly “de­tects” Dise­a­sitis 30% of the time in a healthy pa­tient, and we start out with lots more healthy pa­tients than sick pa­tients.

The test does provide some evidence in favor of the patient being sick. The probability of a patient being sick goes from 20% before the test, to 43% after we see the tongue depressor turn black. But this isn’t conclusive, and we need to perform further tests, maybe more expensive ones.

If you feel like you un­der­stand this prob­lem setup, con­sider try­ing to an­swer the fol­low­ing ques­tion be­fore pro­ceed­ing: What’s the prob­a­bil­ity that a stu­dent who does not turn the tongue de­pres­sor black—a stu­dent with a nega­tive test re­sult—has Dise­a­sitis? Again, we start out with 20% sick and 80% healthy stu­dents, 70% of healthy stu­dents will get a nega­tive test re­sult, and only 10% of sick stu­dents will get a nega­tive test re­sult.

Imagine 20 sick students and 80 healthy students. 10% * 20 = 2 sick students have negative test results. 70% * 80 = 56 healthy students have negative test results. Among the 2 + 56 = 58 total students with negative test results, 2 students are sick students with negative test results. So 2/58 = 1/29 ≈ 3.4% of students with negative test results have Diseasitis.
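The same bookkeeping works for either test result once all four groups of students are counted. The sketch below is one possible way to lay that out; the names `diseasitis_table` and `probability_sick_given` and the dictionary keys are assumptions made for illustration.

```python
from fractions import Fraction

def diseasitis_table(population=100, prior=Fraction(1, 5),
                     p_black_if_sick=Fraction(9, 10),
                     p_black_if_healthy=Fraction(3, 10)):
    """Count students in each (health, test result) group of an imagined population."""
    sick = population * prior
    healthy = population - sick
    return {
        ("sick", "black"): sick * p_black_if_sick,
        ("sick", "not black"): sick * (1 - p_black_if_sick),
        ("healthy", "black"): healthy * p_black_if_healthy,
        ("healthy", "not black"): healthy * (1 - p_black_if_healthy),
    }

def probability_sick_given(result, counts):
    """Fraction of students with the given test result who are sick."""
    sick = counts[("sick", result)]
    healthy = counts[("healthy", result)]
    return sick / (sick + healthy)

counts = diseasitis_table()
p_sick_given_negative = probability_sick_given("not black", counts)
print(counts[("sick", "not black")], counts[("healthy", "not black")])
print(p_sick_given_negative, float(p_sick_given_negative))
```

Calling `probability_sick_given("black", counts)` recovers the positive-test case, and subtracting either result from 1 gives the chance that a student with that test result is healthy.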

For a faster, easier way to perform the same calculation, see Waterfall diagrams and relative odds.


• I think I’d find it eas­ier to un­der­stand if we were talk­ing about some­thing more con­crete, like strep throat.

• Ques­tion of in­ter­est.

• An­swer of in­ter­est.

• Another interesting question to ask: what is the single probability that the tongue depressor gives accurate results?

If we know that the tongue depressor has a 5% error rate on all students, sick and healthy alike, we can predict that the probability of being sick is (another rule here? or another fashion of using the rule if applicable)

• Is this the same question as: what is the probability that the test gives correct results? We can also ask 3 other questions:

• What is the probability that a student with a black tongue depressor does not have Diseasitis?

• What is the probability that a student without a black tongue depressor has Diseasitis?

• What is the probability that a student without a black tongue depressor does not have Diseasitis?

• Per­haps this can be em­pha­sized by use of bold char­ac­ters?

• One of the reasons the results seem counterintuitive is that the “a priori” probability of disease for someone who comes to the clinic is generally much higher than the prevalence of the disease in the general population. About 20% of the general population has Diseasitis. Among the population that comes to the clinic (generally because they have symptoms or have been in close contact with someone with Diseasitis), that percentage is likely much higher.

• I’m pretty sure that the ques­tion be­ing an­swered is “How to find the prob­a­bil­ity of hav­ing a dis­ease if you tested pos­i­tive for it.” I’m ob­serv­ing peo­ple in­ter­pret­ing this to mean “What is the ac­cu­racy of the test?” which is not the same thing.

Maybe add a bit to dis­t­in­guish the two ques­tions?

• For the sake of clarity, please use “he/she” instead of “they”, because “they” might refer to “students”.