Odds form to probability form

The odds form of Bayes’ rule works for any two hypotheses \(H_i\) and \(H_j,\) and looks like this:

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(H_j \mid e)} = \frac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \frac{\mathbb P(e \mid H_i)}{\mathbb P(e \mid H_j)} \tag{1}.$$
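As a quick sanity check, equation (1) can be verified numerically. The priors and likelihoods below are made-up illustrative numbers, not values from the text:

```python
# Two hypotheses H1, H2 with illustrative priors and likelihoods.
p_h1, p_h2 = 0.2, 0.3        # priors P(H1), P(H2)
p_e_h1, p_e_h2 = 0.9, 0.3    # likelihoods P(e | H1), P(e | H2)

# Equation (1): posterior odds = prior odds * likelihood ratio.
prior_odds = p_h1 / p_h2
likelihood_ratio = p_e_h1 / p_e_h2
posterior_odds = prior_odds * likelihood_ratio
print(posterior_odds)  # → 2.0, i.e. H1 is now twice as likely as H2
```

Note that the priors here need not sum to 1; the odds form only cares about the *ratio* between the two hypotheses.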

The probabilistic form of Bayes’ rule requires a hypothesis set \(H_1,H_2,H_3,\ldots\) that is mutually exclusive and exhaustive, and looks like this:

$$\mathbb P(H_i\mid e) = \frac{\mathbb P(e\mid H_i) \cdot \mathbb P(H_i)}{\sum_k \mathbb P(e\mid H_k) \cdot \mathbb P(H_k)} \tag{2}.$$
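Equation (2) can likewise be sketched in a few lines of code. The three hypotheses and their numbers below are illustrative assumptions; the only structural requirement is that the priors sum to 1 (mutually exclusive and exhaustive):

```python
# Mutually exclusive, exhaustive hypotheses: priors sum to 1.
priors = [0.5, 0.3, 0.2]          # P(H_k), illustrative
likelihoods = [0.1, 0.4, 0.8]     # P(e | H_k), illustrative

# Denominator of equation (2): total probability of the evidence.
p_e = sum(l * p for l, p in zip(likelihoods, priors))

# Equation (2) for every hypothesis at once.
posteriors = [l * p / p_e for l, p in zip(likelihoods, priors)]
print(posteriors)  # posteriors sum to 1
```

The shared denominator is what makes the posteriors genuine probabilities: it normalizes the products \(\mathbb P(e\mid H_k)\,\mathbb P(H_k)\) so they sum to 1.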

We will now show that equation (2) follows from equation (1). Given a collection \(H_1,H_2,H_3,\ldots\) of mutually exclusive and exhaustive hypotheses and a hypothesis \(H_i\) from that collection, we can form another hypothesis \(\lnot H_i\) consisting of all the hypotheses \(H_1,H_2,H_3,\ldots\) except \(H_i.\) Then, using \(\lnot H_i\) as \(H_j\) and multiplying the fractions on the right-hand side of equation (1), we see that

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(\lnot H_i \mid e)} = \frac{\mathbb P(H_i) \cdot \mathbb P(e \mid H_i)}{\mathbb P(\lnot H_i)\cdot \mathbb P(e \mid \lnot H_i)}.$$

\(\mathbb P(\lnot H_i)\cdot \mathbb P(e \mid \lnot H_i)\) is the prior probability of \(\lnot H_i\) times the degree to which \(\lnot H_i\) predicted \(e.\) Because \(\lnot H_i\) is made of a bunch of mutually exclusive hypotheses, this term can be calculated by summing \(\mathbb P(H_k) \cdot \mathbb P(e \mid H_k)\) for every \(H_k\) in the collection except \(H_i.\) Performing that replacement, and swapping the order of multiplication, we get:

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(\lnot H_i \mid e)} = \frac{\mathbb P(e \mid H_i) \cdot \mathbb P(H_i)}{\sum_{k \neq i} \mathbb P(e \mid H_k) \cdot \mathbb P(H_k)}.$$

These are the posterior odds for \(H_i\) versus \(\lnot H_i.\) Because \(H_i\) and \(\lnot H_i\) are mutually exclusive and exhaustive, we can convert these odds into a probability for \(H_i,\) by calculating numerator / (numerator + denominator), in the same way that \(3 : 4\) odds become a 3 / (3 + 4) probability. When we do so, equation (2) drops out:

$$\mathbb P(H_i\mid e) = \frac{\mathbb P(e\mid H_i) \cdot \mathbb P(H_i)}{\sum_k \mathbb P(e\mid H_k) \cdot \mathbb P(H_k)}.$$
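The numerator / (numerator + denominator) step can be checked numerically: computing \(\mathbb P(H_i \mid e)\) from the odds against \(\lnot H_i\) gives the same answer as applying equation (2) directly. The numbers below are illustrative assumptions:

```python
priors = [0.5, 0.3, 0.2]          # P(H_k), illustrative
likelihoods = [0.1, 0.4, 0.8]     # P(e | H_k), illustrative
i = 1                             # hypothesis of interest (arbitrary choice)

# Odds route: numerator for H_i, denominator summed over every H_k except H_i.
num = likelihoods[i] * priors[i]
den = sum(l * p for k, (l, p) in enumerate(zip(likelihoods, priors)) if k != i)
p_from_odds = num / (num + den)

# Direct route: equation (2), summing over all hypotheses.
p_direct = num / sum(l * p for l, p in zip(likelihoods, priors))

print(p_from_odds, p_direct)  # identical
```

The two routes agree because adding the \(k = i\) term back into the denominator is exactly what numerator + denominator does.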

Thus, we see that the probabilistic formulation of Bayes’ rule follows from the odds form, but is less general, in that it only works when the set of hypotheses being considered is mutually exclusive and exhaustive.

We also see that the probabilistic formulation converts the posterior odds into a posterior probability. When computing multiple updates in a row, you actually only need to perform this “normalization” step once at the very end of your calculations — which means that the odds form of Bayes’ rule is also more efficient in practice.
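This deferred-normalization point can be sketched concretely. In the toy run below (all numbers are illustrative assumptions), multiplying relative scores through two updates and normalizing once gives the same posteriors as normalizing after every step:

```python
# Three mutually exclusive, exhaustive hypotheses; two pieces of evidence
# observed in sequence. All numbers are illustrative.
priors = [0.5, 0.3, 0.2]      # P(H_k)
like_e1 = [0.1, 0.4, 0.8]     # P(e1 | H_k)
like_e2 = [0.6, 0.5, 0.1]     # P(e2 | H_k)

# Odds-style: carry unnormalized relative scores through both updates,
# then normalize once at the very end.
scores = [p * l1 * l2 for p, l1, l2 in zip(priors, like_e1, like_e2)]
total = sum(scores)
posteriors = [s / total for s in scores]

# Probability-style: renormalize after every single update (same answer,
# one extra normalization per step).
step = [p * l for p, l in zip(priors, like_e1)]
t = sum(step)
step = [s / t for s in step]
step = [p * l for p, l in zip(step, like_e2)]
t = sum(step)
step = [s / t for s in step]

assert all(abs(a - b) < 1e-12 for a, b in zip(posteriors, step))
```

The intermediate normalizations cancel out of the final ratio, which is why the odds form can skip them until the end.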