Conditional probability

The conditional probability \(\mathbb{P}(X\mid Y)\) means “the probability of \(X\) given \(Y\).” That is, \(\mathbb P(left\mid right)\) means “the probability that \(left\) is true, assuming that \(right\) is true.”

\(\mathbb P(yellow\mid banana)\) is the probability that a banana is yellow: if we know something to be a banana, what is the probability that it is yellow?

\(\mathbb P(banana\mid yellow)\) is the probability that a yellow thing is a banana: if the right side is known to be \(yellow\), then we ask about the left side, what is the probability that this is a \(banana\)?


To obtain the probability \(\mathbb P(left \mid right),\) we constrain our attention to only cases where \(right\) is true, and ask about cases within \(right\) where \(left\) is also true.

Let \(X \wedge Y\) denote “\(X\) and \(Y\)” or “\(X\) and \(Y\) are both true”. Then:

$$\mathbb P(left \mid right) = \dfrac{\mathbb P(left \wedge right)}{\mathbb P(right)}.$$

We can see this as a kind of “zooming in” on only the cases where \(right\) is true, and asking, within this universe, for the cases where \(left\) is also true.
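This ratio is easy to compute directly. Here is a minimal sketch in Python; the specific numbers for the banana example (a joint probability of 0.08 and a banana probability of 0.10) are assumptions for illustration, not values from the text:

```python
from fractions import Fraction

def conditional(p_joint, p_right):
    """P(left | right) = P(left AND right) / P(right)."""
    if p_right == 0:
        raise ValueError("P(left | right) is undefined when P(right) = 0")
    return Fraction(p_joint) / Fraction(p_right)

# Assumed numbers: P(yellow AND banana) = 0.08, P(banana) = 0.10,
# so P(yellow | banana) = 0.08 / 0.10 = 4/5.
print(conditional(Fraction(8, 100), Fraction(10, 100)))  # 4/5
```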

Example 1

Suppose you have a bag containing objects that are either red or blue, and either square or round, where the number of each is given by the following table:

$$\begin{array}{l|r|r} & Red & Blue \\ \hline Square & 1 & 2 \\ \hline Round & 3 & 4 \end{array}$$

If you reach in and feel a round object, the conditional probability that it is red is given by zooming in on only the round objects, and asking about the frequency of objects that are both round and red inside this zoomed-in view:

$$\mathbb P(red\mid round) = \dfrac{\mathbb P(red \wedge round)}{\mathbb P(round)} = \dfrac{3}{3 + 4} = \dfrac{3}{7}$$

If you look at the object nearest the top, and can see that it's blue but cannot see its shape, then the conditional probability that it's a square is:

$$\mathbb P(square\mid blue) = \dfrac{\mathbb P(square \wedge blue)}{\mathbb P(blue)} = \dfrac{2}{2 + 4} = \dfrac{1}{3}$$
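Both computations can be checked mechanically from the counts in the table. A small sketch in Python, using exact fractions:

```python
from fractions import Fraction

# Counts from the table: keys are (shape, color) pairs.
counts = {
    ("square", "red"): 1, ("square", "blue"): 2,
    ("round", "red"): 3,  ("round", "blue"): 4,
}
total = sum(counts.values())  # 10 objects in the bag

def p(predicate):
    """Probability that a uniformly drawn object satisfies `predicate`."""
    matching = sum(n for (shape, color), n in counts.items()
                   if predicate(shape, color))
    return Fraction(matching, total)

def conditional(left, right):
    """P(left | right) = P(left AND right) / P(right)."""
    return p(lambda s, c: left(s, c) and right(s, c)) / p(right)

print(conditional(lambda s, c: c == "red", lambda s, c: s == "round"))    # 3/7
print(conditional(lambda s, c: s == "square", lambda s, c: c == "blue"))  # 1/3
```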


Example 2

Suppose you're Sherlock Holmes investigating a case in which a red hair was left at the scene of the crime.

The Scotland Yard detective says, “Aha! Then it's Miss Scarlet. She has red hair, so if she was the murderer she almost certainly would have left a red hair there. \(\mathbb P(redhair\mid Scarlet) = 99\%,\) let's say, which is a near-certain conviction, so we're done.”

“But no,” replies Sherlock Holmes. “You see, but you do not correctly track the meaning of the conditional probabilities, detective. The knowledge we require for a conviction is not \(\mathbb P(redhair\mid Scarlet),\) the chance that Miss Scarlet would leave a red hair, but rather \(\mathbb P(Scarlet\mid redhair),\) the chance that this red hair was left by Scarlet. There are other people in this city who have red hair.”

“So you're saying…” the detective said slowly, “that \(\mathbb P(redhair\mid Scarlet)\) is actually much lower than \(1\)?”

“No, detective. I am saying that just because \(\mathbb P(redhair\mid Scarlet)\) is high does not imply that \(\mathbb P(Scarlet\mid redhair)\) is high. It is the latter probability in which we are interested: the degree to which, knowing that a red hair was left at the scene, we infer that Miss Scarlet was the murderer. This is not the same quantity as the degree to which, assuming Miss Scarlet was the murderer, we would guess that she might leave a red hair.”

“But surely,” said the detective, “these two probabilities cannot be entirely unrelated?”

“Ah, well, for that, you must read up on Bayes’ rule.”
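Holmes's point can be made concrete with Bayes' rule and some made-up numbers (none of these figures are from the story): suppose 100 people could have left the hair, 5 of them red-haired with Miss Scarlet among them, each equally likely a priori to be the culprit, and any red-haired culprit leaves a red hair with probability 0.99:

```python
# All numbers below are assumptions chosen for illustration.
n_people = 100
n_redheads = 5
p_hair_given_redhead = 0.99

p_scarlet = 1 / n_people                     # prior P(Scarlet)
p_hair_given_scarlet = p_hair_given_redhead  # high, as the detective says
p_redhair = n_redheads * p_hair_given_redhead / n_people  # P(redhair)

# Bayes' rule: P(Scarlet | redhair) = P(redhair | Scarlet) P(Scarlet) / P(redhair)
p_scarlet_given_hair = p_hair_given_scarlet * p_scarlet / p_redhair
print(round(p_scarlet_given_hair, 4))  # 0.2, far from a near-certain conviction
```

So \(\mathbb P(redhair\mid Scarlet)\) can be 99% while \(\mathbb P(Scarlet\mid redhair)\) is only 20%, because the red hair could have come from any of the other redheads.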

Example 3

“Even if most Dark Wizards are from Slytherin, very few Slytherins are Dark Wizards. There aren't all that many Dark Wizards, so not all Slytherins can be one.”

“So yeh're saying, that most Dark Wizards are Slytherins… but…”

“But most Slytherins are not Dark Wizards.”

— Harry Potter and the Methods of Rationality, Ch. 100
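The same asymmetry can be shown with illustrative made-up counts (not from the book): say there are 1000 wizards, 250 of them Slytherins, 20 Dark Wizards, 15 of whom came from Slytherin:

```python
from fractions import Fraction

# Assumed counts, chosen only to illustrate the asymmetry.
slytherins, dark_wizards, dark_slytherins = 250, 20, 15

# P(Slytherin | Dark Wizard): most Dark Wizards are Slytherins.
p_slytherin_given_dark = Fraction(dark_slytherins, dark_wizards)  # 3/4

# P(Dark Wizard | Slytherin): very few Slytherins are Dark Wizards.
p_dark_given_slytherin = Fraction(dark_slytherins, slytherins)    # 3/50

print(p_slytherin_given_dark, p_dark_given_slytherin)  # 3/4 3/50
```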


  • Conditional probability: Refresher

    Is P(yellow | banana) the probability that a banana is yellow, or the probability that a yellow thing is a banana?


  • Probability

    The degree to which someone believes something, measured on a scale from 0 to 1, allowing us to do math to it.