Frequency diagrams: A first look at Bayes
Bayesian reasoning is about how to revise our beliefs in the light of evidence.
We’ll start by considering one scenario in which the strength of the evidence has clear numbers attached.
(Don’t worry if you don’t know how to solve the following problem. We’ll see shortly how to solve it.)
Suppose you are a nurse screening a set of students for a sickness called Diseasitis. (Note: literally, “inflammation of the disease”.)
You know, from past population studies, that around 20% of the students will have Diseasitis at this time of year.
You are testing for Diseasitis using a color-changing tongue depressor, which usually turns black if the student has Diseasitis.
Among patients with Diseasitis, 90% turn the tongue depressor black.
However, the tongue depressor is not perfect, and also turns black 30% of the time for healthy students.
One of your students comes into the office, takes the test, and turns the tongue depressor black. What is the probability that they have Diseasitis?
(If you think you see how to do it, you can try to solve this problem before continuing; the derivation will be given shortly.)
This problem can be solved a hard way or a clever easy way. We’ll walk through the hard way first.
First, we imagine a population of 100 students, of whom 20 have Diseasitis and 80 don’t. (Note: Multiple studies show that thinking about concrete numbers such as “20 out of 100 students” or “200 out of 1000 students” is more likely to produce correct spontaneous reasoning on these problems than thinking about percentages like “20% of students.” See, e.g., “Probabilistic reasoning in clinical medicine” by David M. Eddy (1982).)
90% of sick students turn their tongue depressor black, and 30% of healthy students turn the tongue depressor black. So we see black tongue depressors on 90% * 20 = 18 sick students, and 30% * 80 = 24 healthy students.
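The counting step above can be sketched in a few lines of Python. This is only an illustration: the variable names are assumptions introduced here, and integer arithmetic is used so that the head counts stay exact.

```python
# Frequency-count sketch for the Diseasitis example.
# All names here are illustrative, not part of the original problem.
population = 100
sick = population * 20 // 100         # 20% of students are sick -> 20
healthy = population - sick           # the remaining 80 are healthy

sick_black = sick * 90 // 100         # 90% of sick students turn it black
healthy_black = healthy * 30 // 100   # 30% of healthy students turn it black

print(sick_black, healthy_black)      # prints: 18 24
```

Working with whole students rather than percentages mirrors the "20 out of 100" framing used in the text.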
What’s the probability that a student with a black tongue depressor has Diseasitis? From the diagram, there are 18 sick students with black tongue depressors. 18 + 24 = 42 students in total turned their tongue depressors black. Imagine reaching into a bag of all the students with black tongue depressors, and pulling out one of those students at random; what’s the chance a student like that is sick?
The final answer is that a patient with a black tongue depressor has an 18⁄42 = 3⁄7 ≈ 43% probability of being sick.
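As a quick check of this arithmetic, here is a minimal sketch that computes the same ratio exactly. The function name `posterior_from_counts` is a hypothetical helper introduced for illustration; it simply divides the sick students with black tongue depressors by everyone with a black tongue depressor.

```python
from fractions import Fraction

def posterior_from_counts(sick_positive, healthy_positive):
    """Share of positive-testing students who are actually sick."""
    return Fraction(sick_positive, sick_positive + healthy_positive)

# 18 sick and 24 healthy students turned the tongue depressor black.
p_sick_given_black = posterior_from_counts(18, 24)

print(p_sick_given_black)                  # prints: 3/7
print(f"{float(p_sick_given_black):.1%}")  # prints: 42.9%
```

Using `Fraction` keeps the result as the exact 3⁄7 before it is rounded to a percentage.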
Many medical students have at first found this answer counter-intuitive: The test correctly detects Diseasitis 90% of the time! If the test comes back positive, why is it still less than 50% likely that the patient has Diseasitis? Well, the test also incorrectly “detects” Diseasitis 30% of the time in a healthy patient, and we start out with lots more healthy patients than sick patients.
The test does provide some evidence in favor of the patient being sick. The probability of a patient being sick goes from 20% before the test to about 43% after we see the tongue depressor turn black. But this isn’t conclusive, and we need to perform further tests, maybe more expensive ones.
If you feel like you understand this problem setup, consider trying to answer the following question before proceeding: What’s the probability that a student who does not turn the tongue depressor black (that is, a student with a negative test result) has Diseasitis? Again, we start out with 20% sick and 80% healthy students; 70% of healthy students will get a negative test result, and only 10% of sick students will get a negative test result.
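If you would like to check your own answer afterwards, the same frequency-count approach can be applied to the negative results. The sketch below is one possible way to do it, with illustrative variable names; the result is deliberately not shown here, so run it only once you have tried the problem yourself.

```python
from fractions import Fraction

# Same imagined population as before: 20 sick and 80 healthy students.
sick, healthy = 20, 80

sick_negative = sick * 10 // 100        # 10% of sick students test negative
healthy_negative = healthy * 70 // 100  # 70% of healthy students test negative

# Share of negative-testing students who are nonetheless sick.
answer = Fraction(sick_negative, sick_negative + healthy_negative)

print(answer, f"({float(answer):.1%})")
```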