Odds form to probability form

The odds form of Bayes’ rule works for any two hypotheses \(H_i\) and \(H_j,\) and looks like this:

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(H_j \mid e)} = \frac{\mathbb P(H_i)}{\mathbb P(H_j)} \times \frac{\mathbb P(e \mid H_i)}{\mathbb P(e \mid H_j)} \tag{1}.$$
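Equation (1) can be checked numerically. The following sketch uses made-up numbers (two hypotheses about a coin, with invented priors and likelihoods); it verifies that multiplying prior odds by the likelihood ratio gives the same posterior odds as computing each posterior weight directly.

```python
# Hypothetical example: H1 = "coin is fair", H2 = "coin is biased toward heads",
# with evidence e = "the coin came up heads". All numbers are made up.
p_h1, p_h2 = 0.5, 0.5        # priors P(H1), P(H2)
p_e_h1, p_e_h2 = 0.5, 0.75   # likelihoods P(e | H1), P(e | H2)

# Equation (1): posterior odds = prior odds x likelihood ratio.
posterior_odds = (p_h1 / p_h2) * (p_e_h1 / p_e_h2)

# Direct check: P(Hi | e) is proportional to P(Hi) * P(e | Hi),
# so the ratio of those weights must equal the posterior odds.
weight_h1 = p_h1 * p_e_h1
weight_h2 = p_h2 * p_e_h2
assert abs(posterior_odds - weight_h1 / weight_h2) < 1e-12
```

Here the posterior odds come out to \(2 : 3\) in favor of \(H_2\): the evidence shifted the initially even odds by the likelihood ratio \(0.5 / 0.75.\)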

The probabilistic form of Bayes’ rule requires a hypothesis set \(H_1,H_2,H_3,\ldots\) that is mutually exclusive and exhaustive, and looks like this:

$$\mathbb P(H_i\mid e) = \frac{\mathbb P(e\mid H_i) \cdot \mathbb P(H_i)}{\sum_k \mathbb P(e\mid H_k) \cdot \mathbb P(H_k)} \tag{2}.$$
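Equation (2) can likewise be sketched in a few lines, reusing the same made-up coin example. The posterior for each hypothesis is its joint weight \(\mathbb P(e \mid H_k) \cdot \mathbb P(H_k)\) divided by the sum of all the weights:

```python
# Same hypothetical numbers as before, for a mutually exclusive
# and exhaustive set {H1, H2}.
priors = [0.5, 0.5]         # P(Hk)
likelihoods = [0.5, 0.75]   # P(e | Hk)

# Equation (2): normalize the joint weights P(e | Hk) * P(Hk).
joint = [p * l for p, l in zip(priors, likelihoods)]
evidence = sum(joint)       # P(e) = sum over k of P(e | Hk) * P(Hk)
posteriors = [j / evidence for j in joint]

print(posteriors)  # [0.4, 0.6]
```

Note that the posteriors \(0.4\) and \(0.6\) stand in the same \(2 : 3\) ratio as the posterior odds; the denominator only normalizes them to sum to \(1.\)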

We will now show that equation (2) follows from equation (1). Given a collection \(H_1,H_2,H_3,\ldots\) of mutually exclusive and exhaustive hypotheses and a hypothesis \(H_i\) from that collection, we can form another hypothesis \(\lnot H_i\) consisting of all the hypotheses \(H_1,H_2,H_3,\ldots\) except \(H_i.\) Then, using \(\lnot H_i\) as \(H_j\) and multiplying the fractions on the right-hand side of equation (1), we see that

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(\lnot H_i \mid e)} = \frac{\mathbb P(H_i) \cdot \mathbb P(e \mid H_i)}{\mathbb P(\lnot H_i)\cdot \mathbb P(e \mid \lnot H_i)}.$$

\(\mathbb P(\lnot H_i)\cdot \mathbb P(e \mid \lnot H_i)\) is the prior probability of \(\lnot H_i\) times the degree to which \(\lnot H_i\) predicted \(e\) — in other words, the joint probability \(\mathbb P(e \wedge \lnot H_i).\) Because \(\lnot H_i\) is the union of mutually exclusive hypotheses, this term can be calculated by summing \(\mathbb P(H_k) \cdot \mathbb P(e \mid H_k)\) for every \(H_k\) in the collection except \(H_i.\) Performing that replacement, and swapping the order of multiplication, we get:

$$\frac{\mathbb P(H_i \mid e)}{\mathbb P(\lnot H_i \mid e)} = \frac{\mathbb P(e \mid H_i) \cdot \mathbb P(H_i)}{\sum_{k \neq i} \mathbb P(e \mid H_k) \cdot \mathbb P(H_k)}.$$

These are the posterior odds for \(H_i\) versus \(\lnot H_i.\) Because \(H_i\) and \(\lnot H_i\) are mutually exclusive and exhaustive, we can convert these odds into a probability for \(H_i\) by calculating numerator / (numerator + denominator), in the same way that \(3 : 4\) odds become a 3 / (3 + 4) probability. When we do so, equation (2) drops out:

$$\mathbb P(H_i\mid e) = \frac{\mathbb P(e\mid H_i) \cdot \mathbb P(H_i)}{\sum_k \mathbb P(e\mid H_k) \cdot \mathbb P(H_k)}.$$
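The numerator / (numerator + denominator) step can be checked numerically. This sketch uses three hypothetical hypotheses with invented priors and likelihoods, and confirms that converting the posterior odds for \(H_i\) versus \(\lnot H_i\) into a probability agrees with equation (2):

```python
# Made-up numbers for a mutually exclusive and exhaustive set {H1, H2, H3}.
priors = [0.3, 0.5, 0.2]        # P(Hk)
likelihoods = [0.9, 0.4, 0.1]   # P(e | Hk)
i = 0                           # we compute the posterior of H1

# Numerator and denominator of the posterior odds for Hi vs. not-Hi.
numerator = likelihoods[i] * priors[i]
denominator = sum(l * p for k, (l, p) in enumerate(zip(likelihoods, priors))
                  if k != i)

# Odds -> probability: numerator / (numerator + denominator).
prob_from_odds = numerator / (numerator + denominator)

# Equation (2) directly: numerator / (sum over ALL k).
prob_from_eq2 = numerator / sum(l * p for l, p in zip(likelihoods, priors))

assert abs(prob_from_odds - prob_from_eq2) < 1e-12
```

The two computations coincide because the full sum over \(k\) is just the numerator plus the sum over \(k \neq i.\)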

Thus, we see that the probabilistic formulation of Bayes’ rule follows from the odds form, but is less general, in that it only works when the set of hypotheses being considered is mutually exclusive and exhaustive.

We also see that the probabilistic formulation converts the posterior odds into a posterior probability. When computing multiple updates in a row, you actually only need to perform this “normalization” step once at the very end of your calculations — which means that the odds form of Bayes’ rule is also more efficient in practice.
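To illustrate the efficiency point, here is a sketch with two made-up pieces of evidence \(e_1\) and \(e_2\) (assumed conditionally independent given each hypothesis, an assumption added for this example). Multiplying all the likelihoods through and normalizing once at the end gives the same posteriors as normalizing after every update:

```python
# Hypothetical numbers: three hypotheses, two pieces of evidence.
priors = [0.3, 0.5, 0.2]
like_e1 = [0.9, 0.4, 0.1]   # P(e1 | Hk)
like_e2 = [0.2, 0.6, 0.7]   # P(e2 | Hk), assuming conditional independence

# Odds-form style: carry unnormalized weights, normalize once at the end.
weights = [p * l1 * l2 for p, l1, l2 in zip(priors, like_e1, like_e2)]
normalize_once = [w / sum(weights) for w in weights]

# Probability-form style: renormalize after every single update.
step1 = [p * l1 for p, l1 in zip(priors, like_e1)]
step1 = [x / sum(step1) for x in step1]
step2 = [x * l2 for x, l2 in zip(step1, like_e2)]
step2 = [x / sum(step2) for x in step2]

# Both procedures yield the same posteriors.
assert all(abs(a - b) < 1e-12 for a, b in zip(normalize_once, step2))
```

Normalization only rescales the weights by a constant, so deferring it to the end changes nothing about the final ratios — it just saves a division per hypothesis per update.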