Likelihood function

Let’s say you have a piece of evidence \(e\) and a set of hypotheses \(\mathcal H.\) Each \(H_i \in \mathcal H\) assigns some likelihood to \(e.\) The function \(\mathcal L_{e}(H_i)\) that reports this likelihood for each \(H_i \in \mathcal H\) is known as a “likelihood function.”

For example, let’s say that the evidence is \(e_c\) = “Mr. Boddy was killed with a candlestick,” and the hypotheses are \(H_S\) = “Miss Scarlett did it,” \(H_M\) = “Colonel Mustard did it,” and \(H_P\) = “Mrs. Peacock did it.” Suppose that if Miss Scarlett was the murderer, she’s 20% likely to have used a candlestick. By contrast, if Colonel Mustard did it, he’s 10% likely to have used a candlestick, and if Mrs. Peacock did it, she’s only 1% likely to have used a candlestick. In this case, the likelihood function is

$$\mathcal L_{e_c}(h) = \begin{cases} 0.2 & \text{if $h = H_S$} \\ 0.1 & \text{if $h = H_M$} \\ 0.01 & \text{if $h = H_P$} \\ \end{cases} $$
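The discrete likelihood function above can be sketched as a small lookup. This is a minimal illustration using the values from the example; the function name and hypothesis labels are our own choices, not anything standard.

```python
def likelihood_candlestick(h):
    """Likelihood L_e(h) that the evidence 'Mr. Boddy was killed
    with a candlestick' would occur under each hypothesis h."""
    likelihoods = {
        "Scarlett": 0.20,  # likelihood of a candlestick if Miss Scarlett did it
        "Mustard": 0.10,   # likelihood of a candlestick if Colonel Mustard did it
        "Peacock": 0.01,   # likelihood of a candlestick if Mrs. Peacock did it
    }
    return likelihoods[h]

print(likelihood_candlestick("Scarlett"))  # 0.2
```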

For an example using a continuous function, consider a possibly-biased coin whose bias \(b\) to come up heads on any particular coinflip might be anywhere between \(0\) and \(1\). Suppose we observe the coin to come up heads, tails, and tails. We will denote this evidence \(e_{HTT}.\) The likelihood function over each hypothesis \(H_b\) = “the coin is biased to come up heads \(b\) portion of the time” for \(b \in [0, 1]\) is:

$$\mathcal L_{e_{HTT}}(H_b) = b\cdot (1-b)\cdot (1-b).$$
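This continuous likelihood function is easy to sketch directly (a minimal illustration; the function name is ours):

```python
def likelihood_htt(b):
    """Likelihood of observing heads, tails, tails given a coin
    whose bias toward heads is b, for b in [0, 1]."""
    return b * (1 - b) * (1 - b)
```

Note that this function peaks at \(b = 1/3\), matching the observed frequency of one head in three flips, where it takes the value \(4/27 \approx 0.148\).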

There’s no reason to normalize likelihood functions so that they sum to 1 — they aren’t probability distributions, they’re functions expressing each hypothesis’ propensity to yield the observed evidence. For example, if the evidence was really obvious (\(e_s\) = “the sun rose this morning”), it might be the case that almost all hypotheses have a very high likelihood, in which case the sum of the likelihood function will be much more than 1.
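We can check this numerically for the coin example: integrating \(b(1-b)^2\) over \(b \in [0, 1]\) gives \(1/12\), not 1, so the likelihood function is not a probability distribution over hypotheses. A rough Riemann-sum sketch:

```python
# Approximate the integral of b * (1 - b)^2 over [0, 1] with a
# left-endpoint Riemann sum. The exact value is 1/12, not 1.
n = 1_000_000
total = sum((i / n) * (1 - i / n) ** 2 for i in range(n)) / n
print(total)  # close to 1/12, i.e. about 0.0833
```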

Likelihood functions carry absolute likelihood information, and therefore, they contain information that relative likelihoods do not. Namely, absolute likelihoods can be used to check a hypothesis for strict confusion.