# Proof of Bayes' rule

Bayes’ rule (in the odds form) says that, for every pair of hypotheses $$H_i$$ and $$H_j$$ and piece of evidence $$e,$$

$$\dfrac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \dfrac{\mathbb P(e \mid H_i)}{\mathbb P(e \mid H_j)} = \dfrac{\mathbb P(H_i \mid e)}{\mathbb P(H_j \mid e)}.$$

By the definition of conditional probability, $$\mathbb P(e \land H) = \mathbb P(H) \cdot \mathbb P(e \mid H),$$ so

$$\dfrac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \dfrac{\mathbb P(e\mid H_i)}{\mathbb P(e\mid H_j)} = \dfrac{\mathbb P(e \wedge H_i)}{\mathbb P(e \wedge H_j)}$$

Dividing both the numerator and the denominator by $$\mathbb P(e)$$ (which must be nonzero for the posterior to be defined), we have

$$\dfrac{\mathbb P(e \wedge H_i)}{\mathbb P(e \wedge H_j)} = \dfrac{\mathbb P(e \wedge H_i) / \mathbb P(e)}{\mathbb P(e \wedge H_j) / \mathbb P(e)}$$

Invoking the definition of conditional probability again,

$$\dfrac{\mathbb P(e \wedge H_i) / \mathbb P(e)}{\mathbb P(e \wedge H_j) / \mathbb P(e)} = \dfrac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)}.$$

Done.
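The three steps can be checked numerically. The following is a minimal sketch in Python, assuming a small hypothetical joint distribution (the table `joint` and the helper names are illustrative, not part of the original proof). A third hypothesis is included so that $$H_1$$ and $$H_2$$ are not exhaustive.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities P(H and e) and P(H and not-e).
# H3 is included so that H1 and H2 are deliberately not exhaustive.
joint = {
    ("H1", True): F(1, 10), ("H1", False): F(2, 10),
    ("H2", True): F(3, 10), ("H2", False): F(1, 10),
    ("H3", True): F(1, 10), ("H3", False): F(2, 10),
}

def p_h(h):
    """Prior P(h): total mass on hypothesis h."""
    return sum(p for (hyp, _), p in joint.items() if hyp == h)

def p_e():
    """P(e): total mass on worlds where the evidence is true."""
    return sum(p for (_, ev), p in joint.items() if ev)

def p_e_given_h(h):
    """Likelihood P(e | h) = P(e and h) / P(h)."""
    return joint[(h, True)] / p_h(h)

def p_h_given_e(h):
    """Posterior P(h | e) = P(e and h) / P(e)."""
    return joint[(h, True)] / p_e()

prior_odds = p_h("H1") / p_h("H2")
likelihood_ratio = p_e_given_h("H1") / p_e_given_h("H2")
joint_ratio = joint[("H1", True)] / joint[("H2", True)]
posterior_odds = p_h_given_e("H1") / p_h_given_e("H2")

# Step 1: prior odds times likelihood ratio equals the ratio of joint masses.
assert prior_odds * likelihood_ratio == joint_ratio
# Steps 2 and 3: dividing both joint masses by P(e) leaves the ratio unchanged.
assert joint_ratio == posterior_odds

print(prior_odds, likelihood_ratio, posterior_odds)  # 3/4 4/9 1/3
```

Exact fractions are used instead of floats so that the equalities hold exactly and are not subject to rounding.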

Of note is the equality

$$\frac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)} = \frac{\mathbb P(H_i \land e)}{\mathbb P(H_j \land e)},$$

which says that the posterior odds (on the left) for $$H_i$$ (vs $$H_j$$) given evidence $$e$$ are exactly equal to the prior odds of $$H_i$$ (vs $$H_j$$) in the parts of $$\mathbb P$$ where $$e$$ was already true. $$\mathbb P(x \land e)$$ is the amount of probability mass that $$\mathbb P$$ allocated to worlds where both $$x$$ and $$e$$ are true, and the above equation says that after observing $$e,$$ your belief in $$H_i$$ relative to $$H_j$$ should be equal to $$H_i$$'s odds relative to $$H_j$$ in those worlds. In other words, Bayes’ rule can be interpreted as saying: “Once you’ve seen $$e$$, simply throw away all probability mass except the mass on worlds where $$e$$ was true, and then continue reasoning according to the remaining probability mass.” See also Belief revision as probability elimination.
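The elimination reading can be sketched in code. The example below assumes a hypothetical set of worlds with made-up masses (the names `worlds`, `eliminate`, `mass`, and `odds` are illustrative). It discards every world where $$e$$ is false and shows that the odds computed from the leftover, unnormalized mass match the odds computed after renormalizing.

```python
from fractions import Fraction as F

# Hypothetical worlds: (hypothesis, is e true?) -> probability mass.
worlds = {
    ("H_i", True): F(2, 20), ("H_i", False): F(3, 20),
    ("H_j", True): F(5, 20), ("H_j", False): F(4, 20),
    ("other", True): F(1, 20), ("other", False): F(5, 20),
}

def eliminate(worlds):
    """Keep only the worlds where e is true; do not renormalize."""
    return {w: m for w, m in worlds.items() if w[1]}

def mass(worlds, h):
    """Total mass that the given collection of worlds assigns to hypothesis h."""
    return sum(m for (hyp, _), m in worlds.items() if hyp == h)

def odds(worlds, a, b):
    """Odds of a versus b; works on unnormalized mass as well."""
    return mass(worlds, a) / mass(worlds, b)

remaining = eliminate(worlds)
total = sum(remaining.values())  # this is P(e)
posterior = {w: m / total for w, m in remaining.items()}

# The odds are the same before and after renormalizing by P(e).
assert odds(remaining, "H_i", "H_j") == odds(posterior, "H_i", "H_j")

print(odds(worlds, "H_i", "H_j"))     # prior odds: 5/9
print(odds(remaining, "H_i", "H_j"))  # posterior odds: 2/5
```

Renormalizing is only needed to read off absolute probabilities; the relative odds are already fixed once the worlds where $$e$$ is false have been removed.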

## Illustration (using the Diseasitis example)

Specializing to the Diseasitis problem, using red for sick, blue for healthy, and + signs for positive test results, the proof above can be depicted visually. The visualization can be read as saying: The ratio of the initial sick population (red) to the initial healthy population (blue), times the ratio of positive results (+) in the sick population to positive results in the healthy population, equals the ratio of the positive-and-red population to the positive-and-blue population. Thus we can divide both by the proportion of the whole population which got positive results (grey and +), yielding the posterior odds of sick (red) vs healthy (blue) among only those with positive results.

The corresponding numbers are:

$$\dfrac{20\%}{80\%} \times \dfrac{90\%}{30\%} = \dfrac{18\%}{24\%} = \dfrac{0.18 / 0.42}{0.24 / 0.42} = \dfrac{3}{4}$$

for a final probability $$\mathbb P(\text{sick} \mid \text{positive})$$ of $$\frac{3}{7} \approx 43\%.$$
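The same arithmetic can be reproduced in a few lines. This is a minimal sketch using the figures quoted above (20% sick, 90% positive among the sick, 30% positive among the healthy). The variable names are illustrative.

```python
from fractions import Fraction as F

p_sick = F(20, 100)               # prior probability of being sick
p_pos_given_sick = F(90, 100)     # positive rate among sick patients
p_pos_given_healthy = F(30, 100)  # positive rate among healthy patients

prior_odds = p_sick / (1 - p_sick)                          # 1 : 4
likelihood_ratio = p_pos_given_sick / p_pos_given_healthy   # 3 : 1
posterior_odds = prior_odds * likelihood_ratio              # 3 : 4

# Sick and healthy are exhaustive here, so odds a : b normalize to a / (a + b).
posterior_prob = posterior_odds / (1 + posterior_odds)

# Cross-check against the probability form, dividing by P(positive) = 0.42.
p_positive = p_sick * p_pos_given_sick + (1 - p_sick) * p_pos_given_healthy
assert posterior_prob == p_sick * p_pos_given_sick / p_positive

print(posterior_odds, posterior_prob, round(float(posterior_prob), 2))
# 3/4 3/7 0.43
```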

## Generality

The odds and proportional forms of Bayes’ rule talk about the relative probability of two hypotheses $$H_i$$ and $$H_j.$$ In the particular example of Diseasitis it happens that every patient is either sick or not-sick, so that we can normalize the final odds 3 : 4 to probabilities of $$\frac{3}{7} : \frac{4}{7}.$$ However, the proof above shows that even if we were talking about two different possible diseases and their total prevalences did not sum to 1, the equation above would still hold between the relative prior odds $$\frac{\mathbb P(H_i)}{\mathbb P(H_j)}$$ and the relative posterior odds $$\frac{\mathbb P(H_i\mid e)}{\mathbb P(H_j\mid e)}.$$
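As a hypothetical illustration (these numbers are assumptions and do not come from the Diseasitis problem), suppose disease A has a prevalence of 10% and disease B a prevalence of 5%, no patient has both, and a test comes back positive for 80% of patients with A and 40% of patients with B. Then

$$\dfrac{10\%}{5\%} \times \dfrac{80\%}{40\%} = \dfrac{8\%}{2\%} = \dfrac{4}{1},$$

so the posterior odds are 4 : 1 for A versus B. This calculation never needed the positive rate among the remaining 85% of patients, which is to say it never needed $$\mathbb P(e).$$ That quantity matters only when converting the odds into absolute probabilities.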

The above proof can be specialized to the probabilistic case; see Proof of Bayes’ rule: Probability form.
