Proof of Bayes' rule

Bayes’ rule (in the odds form) says that, for every pair of hypotheses \(H_i\) and \(H_j\) and every piece of evidence \(e,\)

$$\dfrac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \dfrac{\mathbb P(e \mid H_i)}{\mathbb P(e \mid H_j)} = \dfrac{\mathbb P(H_i \mid e)}{\mathbb P(H_j \mid e)}.$$

By the definition of conditional probability, \(\mathbb P(e \land H) = \mathbb P(H) \cdot \mathbb P(e \mid H),\) so

$$ \dfrac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \dfrac{\mathbb P(e\mid H_i)}{\mathbb P(e\mid H_j)} = \dfrac{\mathbb P(e \wedge H_i)}{\mathbb P(e \wedge H_j)} $$

Dividing both the numerator and the denominator by \(\mathbb P(e)\) (assumed to be nonzero), we have

$$ \dfrac{\mathbb P(e \wedge H_i)}{\mathbb P(e \wedge H_j)} = \dfrac{\mathbb P(e \wedge H_i) / \mathbb P(e)}{\mathbb P(e \wedge H_j) / \mathbb P(e)} $$

Invoking the definition of conditional probability again,

$$ \dfrac{\mathbb P(e \wedge H_i) / \mathbb P(e)}{\mathbb P(e \wedge H_j) / \mathbb P(e)} = \dfrac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)}.$$


Of note is the equality

$$\frac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)} = \frac{\mathbb P(H_i \land e)}{\mathbb P(H_j \land e)},$$

which says that the posterior odds (on the left) for \(H_i\) (vs \(H_j\)) given evidence \(e\) are exactly equal to the prior odds of \(H_i\) (vs \(H_j\)) in the parts of \(\mathbb P\) where \(e\) was already true. \(\mathbb P(x \land e)\) is the amount of probability mass that \(\mathbb P\) allocated to worlds where both \(x\) and \(e\) are true, and the above equation says that after observing \(e,\) your belief in \(H_i\) relative to \(H_j\) should be equal to \(H_i\)'s odds relative to \(H_j\) in those worlds. In other words, Bayes’ rule can be interpreted as saying: “Once you’ve seen \(e\), simply throw away all probability mass except the mass on worlds where \(e\) was true, and then continue reasoning according to the remaining probability mass.” See also Belief revision as probability elimination.
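The "throw away the mass where \(e\) is false" reading can be checked on a toy example. The following is only a minimal sketch: the joint distribution, the third hypothesis `H_k`, and the helper name `eliminate` are all illustrative assumptions, not part of the original proof.

```python
from fractions import Fraction

# Hypothetical joint distribution P(H and e) over "worlds".
# Keys are (hypothesis, e_is_true); the numbers are made up and sum to 1.
joint = {
    ("H_i", True): Fraction(3, 20),
    ("H_i", False): Fraction(1, 20),
    ("H_j", True): Fraction(1, 10),
    ("H_j", False): Fraction(2, 5),
    ("H_k", True): Fraction(1, 4),
    ("H_k", False): Fraction(1, 20),
}

def eliminate(joint):
    """Discard all mass on worlds where e is false, then renormalize."""
    surviving = {h: p for (h, e_true), p in joint.items() if e_true}
    p_e = sum(surviving.values())  # total surviving mass, i.e. P(e)
    return {h: p / p_e for h, p in surviving.items()}

posterior = eliminate(joint)

# Odds of H_i vs H_j inside the e-worlds, before renormalizing.
joint_odds = joint[("H_i", True)] / joint[("H_j", True)]
# Posterior odds of H_i vs H_j after renormalizing.
posterior_odds = posterior["H_i"] / posterior["H_j"]

print(joint_odds, posterior_odds)
```

Renormalizing divides both hypotheses' surviving mass by the same \(\mathbb P(e)\), so the two ratios agree; that is the content of the equality above.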

Illustration (using the Diseasitis example)

Specializing to the Diseasitis problem, using red for sick, blue for healthy, and + signs for positive test results, the proof above can be visually depicted as follows:

[Figure: Bayes Venn diagram]

This visualization can be read as saying: The ratio of the initial sick population (red) to the initial healthy population (blue), times the ratio of the rate of positive results (+) in the sick population to the rate of positive results in the healthy population, equals the ratio of the positive-and-red population to the positive-and-blue population. We can then divide both of these by the proportion of the whole population which got positive results (grey and +), yielding the posterior odds of sick (red) vs healthy (blue) among only those with positive results.

The corresponding numbers are:

$$\dfrac{20\%}{80\%} \times \dfrac{90\%}{30\%} = \dfrac{18\%}{24\%} = \dfrac{0.18 / 0.42}{0.24 / 0.42} = \dfrac{3}{4}$$

for final odds of 3 : 4, that is, a posterior probability \(\mathbb P(\text{sick} \mid +)\) of \(\frac{3}{7} \approx 43\%.\)
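The arithmetic above can be reproduced with exact fractions. This is a small sketch using the Diseasitis figures quoted in the equation (20% prior, 90% and 30% positive rates); the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Figures from the Diseasitis example above.
prior_sick = Fraction(20, 100)
prior_healthy = Fraction(80, 100)
pos_given_sick = Fraction(90, 100)
pos_given_healthy = Fraction(30, 100)

# Odds form: prior odds times likelihood ratio.
prior_odds = prior_sick / prior_healthy
likelihood_ratio = pos_given_sick / pos_given_healthy
posterior_odds = prior_odds * likelihood_ratio

# The same result via the joint probabilities, as in the proof.
joint_sick = prior_sick * pos_given_sick           # P(+ and sick) = 18%
joint_healthy = prior_healthy * pos_given_healthy  # P(+ and healthy) = 24%
p_positive = joint_sick + joint_healthy            # P(+) = 42%
posterior_sick = joint_sick / p_positive

print(posterior_odds)   # 3/4
print(posterior_sick)   # 3/7
```

Converting odds to a probability is only valid here because sick and healthy are the only two possibilities: `posterior_odds / (1 + posterior_odds)` gives the same 3/7.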


The odds and proportional forms of Bayes’ rule talk about the relative probability of two hypotheses \(H_i\) and \(H_j.\) In the particular example of Diseasitis it happens that every patient is either sick or not-sick, so that we can normalize the final odds 3 : 4 to probabilities of \(\frac{3}{7} : \frac{4}{7}.\) However, the proof above shows that even if we were talking about two different possible diseases and their total prevalences did not sum to 1, the equation above would still hold between the prior odds \(\frac{\mathbb P(H_i)}{\mathbb P(H_j)}\) and the posterior odds \(\frac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)}.\)
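To illustrate the non-exhaustive case, here is a sketch with two hypothetical diseases, A and B, whose prevalences do not sum to 1. All of the numbers below are invented for illustration, and the example assumes that each patient has at most one of the two diseases.

```python
from fractions import Fraction

# Hypothetical priors: most patients have neither disease.
prior = {"A": Fraction(1, 10), "B": Fraction(1, 20), "neither": Fraction(17, 20)}
# Hypothetical probability of a positive test under each hypothesis.
likelihood = {"A": Fraction(4, 5), "B": Fraction(2, 5), "neither": Fraction(1, 10)}

# Joint mass P(e and H) for each hypothesis, and the total P(e).
joint = {h: prior[h] * likelihood[h] for h in prior}
p_e = sum(joint.values())
posterior = {h: joint[h] / p_e for h in joint}

prior_odds = prior["A"] / prior["B"]
likelihood_ratio = likelihood["A"] / likelihood["B"]
posterior_odds = posterior["A"] / posterior["B"]

print(prior_odds * likelihood_ratio, posterior_odds)
print(posterior["A"] + posterior["B"])
```

The posterior odds of A vs B still equal the prior odds times the likelihood ratio, even though the posterior probabilities of A and B do not sum to 1, so those odds cannot be normalized into probabilities without accounting for the remaining hypothesis.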

The above proof can be specialized to the probabilistic case; see Proof of Bayes’ rule: Probability form.


