# Theorem and proof

Consider a sequence $$X_1, \dots, X_n$$ of binary values (0 or 1), e.g., the flips of a potentially unfair coin which comes up heads or tails on each flip.

Laplace’s Rule of Succession says that if:

1. We think each coinflip $$X_i$$ has an independent, unknown likelihood $$f$$ of coming up heads, and

2. We think that all values of $$f$$ between 0 and 1 have equal probability density a priori,

then, after observing $$M$$ heads and $$N$$ tails, the expected probability of heads on the next coinflip is:

$$\dfrac{M + 1}{M + N + 2}$$
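As a quick sanity check, the rule is easy to compute directly (a minimal sketch; the function name is ours):

```python
from fractions import Fraction

def laplace_rule(M: int, N: int) -> Fraction:
    """Expected probability of heads after observing M heads and N tails."""
    return Fraction(M + 1, M + N + 2)

# With no observations, the rule returns the uniform prior's mean, 1/2.
print(laplace_rule(0, 0))  # 1/2
# After 2 heads and 3 tails, it returns 3/7.
print(laplace_rule(2, 3))  # 3/7
```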

Proof:

For a hypothetical value of $$f$$, each coinflip observed has a likelihood of $$f$$ if heads or $$1 - f$$ if tails.

The prior is uniform between 0 and 1, so the prior density is 1 everywhere.

By Bayes’s Rule, after seeing $$M$$ heads and $$N$$ tails, the posterior probability density over $$f$$ is proportional to $$1 \cdot f^M(1 - f)^N.$$

Then the normalizing constant is: $$\int_0^1 f^M(1 - f)^N \operatorname{d}\!f = \frac{M!N!}{(M + N + 1)!}.$$
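This is the Beta function identity $$B(M+1, N+1) = \frac{M!N!}{(M+N+1)!},$$ and it can be checked numerically (a sketch using a simple midpoint Riemann sum):

```python
from math import factorial

def integral_numeric(M, N, steps=100_000):
    # Midpoint Riemann sum of f^M (1 - f)^N over [0, 1].
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** M * (1 - (i + 0.5) * h) ** N
                   for i in range(steps))

M, N = 2, 3
exact = factorial(M) * factorial(N) / factorial(M + N + 1)  # 12/720 = 1/60
print(exact, integral_numeric(M, N))  # both ≈ 0.016666...
```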

So the posterior probability density function is $$f^M(1 - f)^N \cdot \frac{(M + N + 1)!}{M!N!}.$$

Integrating this density times $$f$$ from 0 to 1, i.e., taking the posterior mean of $$f$$, yields the marginal probability of getting heads on the next flip.

The answer is thus:

$$\dfrac{(M+1)!N!}{(M + N + 2)!} \cdot \dfrac{(M + N + 1)!}{M!N!} = \dfrac{M + 1}{M + N + 2}.$$
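The whole derivation can also be verified by brute force: sample the bias $$f$$ from the uniform prior, keep only the runs that match the observed data, and count how often the next flip comes up heads (a Monte Carlo sketch, using rejection sampling):

```python
import random

def simulate(M, N, trials=50_000, seed=0):
    """Estimate P(heads next | M heads, N tails seen) by drawing f from the
    uniform prior and rejecting runs that don't reproduce the data."""
    rng = random.Random(seed)
    accepted = heads_next = 0
    while accepted < trials:
        f = rng.random()                                 # f ~ Uniform(0, 1)
        heads = sum(rng.random() < f for _ in range(M + N))
        if heads != M:                                   # condition on the data
            continue
        accepted += 1
        heads_next += rng.random() < f                   # flip once more
    return heads_next / accepted

print(simulate(2, 3))  # ≈ (2 + 1) / (2 + 3 + 2) = 3/7 ≈ 0.43
```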

# Simpler proof by combinatorics

Laplace’s Rule of Succession was originally proved (by Thomas Bayes) by finding the posterior probability density and integrating, and that proof illustrates the core idea of an inductive prior in Bayesianism. A simpler intuition for the result also exists.

Consider the problem originally posed by Thomas Bayes: An initial billiard ball is rolled back and forth between the left and right edges of an ideal billiards table until friction brings it to a halt. We then roll $$M + N$$ additional billiard balls, and observe that $$M$$ halt to the left of the initial ball and $$N$$ halt to the right of it. If this is all we know, what is the probability that the next ball halts on the left, or on the right?

Suppose that we rolled a total of 5 additional billiards, and 2 halted to the left of the original, and 3 halted to the right. Then, using | to symbolize the initial billiard, the billiards would have come to rest in the order:

• LL|RRR

Suppose we now roll a new billiard, symbolized by +, until it comes to a halt. It’s equally likely to appear at:

• +LL|RRR

• L+L|RRR

• LL+|RRR

• LL|+RRR

• LL|R+RR

• LL|RR+R

• LL|RRR+

This means there are 3 positions where the new ball could land to the left of the |, and 4 positions where it could land to the right. Since all left-to-right orderings of 7 randomly rolled billiard balls are equally likely a priori, we assign probability 3/7 that the ball comes to rest on the left of the original ball’s position, matching $$\frac{M + 1}{M + N + 2} = \frac{2 + 1}{2 + 3 + 2}.$$
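The billiards setup itself is easy to simulate, which gives an independent check on the 3/7 answer (a sketch; resting positions are drawn uniformly on the table):

```python
import random

rng = random.Random(1)
matched = next_left = 0
while matched < 30_000:
    initial = rng.random()                    # initial billiard's resting point
    balls = [rng.random() for _ in range(5)]  # five more billiards
    if sum(b < initial for b in balls) != 2:  # keep runs with 2 left, 3 right
        continue
    matched += 1
    next_left += rng.random() < initial       # roll one more billiard
print(next_left / matched)  # ≈ 3/7 ≈ 0.43
```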

# Use and abuse

Laplace’s Rule of Succession assumes that all prior values of the frequency $$f$$ are undistinguished a priori in our subjective knowledge.

For example, Laplace used the rule to estimate the probability of the sun rising tomorrow, given that it had risen every day for the past 5000 years, and arrived at odds of around 1,826,214:1. But today, when we have physical knowledge of the Sun’s operation, not every possible ‘rate at which the Sun rises each day’ is undistinguished. Furthermore, even in Laplace’s time, he should perhaps have treated “the Sun always rises” and “the Sun never rises” as unusually likely frequencies a priori, rather than giving them no more prior density than any other value of $$f$$.
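The arithmetic behind odds of this magnitude is straightforward to reconstruct (a sketch; the exact day count depends on the assumed year length):

```python
# Rough reconstruction of Laplace's sunrise calculation: about 5000 years of
# recorded history with the sun rising every day and never failing (N = 0).
days = round(5000 * 365.2426)     # assumed year length; gives 1,826,213 days
p_rise = (days + 1) / (days + 2)  # rule of succession with M = days, N = 0
odds_for = days + 1               # odds in favor of sunrise: (M + 1) : 1
print(odds_for)  # 1826214
```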

The Rule of Succession follows from assuming approximate ignorance about prior frequencies. It does not, of itself, justify this assumption. Variations of the rule of succession are obtainable by taking different priors, corresponding to different views of what should count as uninformative. See the discussion on the Wikipedia page on non-informative priors. For example, starting with the Jeffreys prior, after observing $$M$$ heads and $$N$$ tails, the expected probability of heads on the next coinflip is:

$$\dfrac{M + \dfrac{1}{2}}{M + N + 1}$$
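To see how the choice of prior matters in practice, here is a small comparison of the two estimators on the same data (a sketch; the function names are ours):

```python
from fractions import Fraction

def laplace(M, N):   # uniform prior: (M + 1) / (M + N + 2)
    return Fraction(M + 1, M + N + 2)

def jeffreys(M, N):  # Jeffreys prior: (M + 1/2) / (M + N + 1)
    return Fraction(2 * M + 1, 2 * (M + N + 1))

for M, N in [(0, 0), (2, 3), (10, 0)]:
    print(M, N, laplace(M, N), jeffreys(M, N))
# After 10 heads and no tails, Jeffreys gives 21/22 ≈ 0.95, while the
# uniform prior's 11/12 ≈ 0.92 is pulled more strongly toward 1/2.
```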

# Nomenclature

Laplace’s Rule of Succession was the famous problem proved by Thomas Bayes in “An Essay towards solving a Problem in the Doctrine of Chances”, read to the Royal Society in 1763, after Bayes’s death. Pierre-Simon Laplace, the first systematizer of what we now know as Bayesian reasoning, was so impressed by this theorem that he named the central theorem of his new discipline after Thomas Bayes. The original theorem proven by Bayes was popularized by Laplace in arguments about the problem of induction, and so became known as Laplace’s Rule of Succession.

Parents:

• Inductive prior

Some states of pre-observation belief can learn quickly; others never learn anything. An “inductive prior” is of the former type.
