Bayesian view of scientific virtues

A num­ber of sci­en­tific virtues are ex­plained in­tu­itively by Bayes’ rule, in­clud­ing:

  • Falsifi­a­bil­ity: A good sci­en­tist should say what they do not ex­pect to see if a the­ory is true.

  • Bold­ness: A good the­ory makes bold ex­per­i­men­tal pre­dic­tions (that we wouldn’t oth­er­wise ex­pect)

  • Pre­ci­sion: A good the­ory makes pre­cise ex­per­i­men­tal pre­dic­tions (that turn out cor­rect)

  • Falsifi­ca­tion­ism: Ac­cep­tance of a sci­en­tific the­ory is always pro­vi­sional; re­jec­tion of a sci­en­tific the­ory is pretty per­ma­nent.

  • Ex­per­i­men­ta­tion: You find bet­ter the­o­ries by mak­ing ob­ser­va­tions, and then up­dat­ing your be­liefs.


“Falsifi­a­bil­ity” means say­ing which events and ob­ser­va­tions should definitely not hap­pen if your the­ory is true.

This was first pop­u­larized as a sci­en­tific virtue by Karl Pop­per, who wrote, in a fa­mous cri­tique of Freudian psy­cho­anal­y­sis:

Nei­ther Freud nor Adler ex­cludes any par­tic­u­lar per­son’s act­ing in any par­tic­u­lar way, what­ever the out­ward cir­cum­stances. Whether a man sac­ri­ficed his life to res­cue a drown­ing child (a case of sub­li­ma­tion) or whether he mur­dered the child by drown­ing him (a case of re­pres­sion) could not pos­si­bly be pre­dicted or ex­cluded by Freud’s the­ory; the the­ory was com­pat­i­ble with ev­ery­thing that could hap­pen.

In a Bayesian sense, we can see a hy­poth­e­sis’s falsifi­a­bil­ity as a re­quire­ment for ob­tain­ing strong like­li­hood ra­tios in fa­vor of the hy­poth­e­sis, com­pared to, e.g., the al­ter­na­tive hy­poth­e­sis “I don’t know.”

Sup­pose you’re a very early re­searcher on grav­i­ta­tion, named Grek. Your friend Thag is hold­ing a rock in one hand, about to let it go. You need to pre­dict whether the rock will move down­ward to the ground, fly up­ward into the sky, or do some­thing else. That is, you must say how your the­ory \(Grek\) as­signs its prob­a­bil­ities over \(up, down,\) and \(other.\)

As it hap­pens, your friend Thag has his own the­ory \(Thag\) which says “Rocks do what they want to do.” If Thag sees the rock go down, he’ll ex­plain this by say­ing the rock wanted to go down. If Thag sees the rock go up, he’ll say the rock wanted to go up. Thag thinks that the Thag The­ory of Grav­i­ta­tion is a very good one be­cause it can ex­plain any pos­si­ble thing the rock is ob­served to do. This makes it su­pe­rior com­pared to a the­ory that could only ex­plain, say, the rock fal­ling down.

As a Bayesian, however, you realize that since \(up, down,\) and \(other\) are mutually exclusive and exhaustive possibilities, and something must happen when Thag lets go of the rock, the conditional probabilities \(\mathbb P(\cdot\mid Thag)\) must sum to 1: \(\mathbb P(up\mid Thag) + \mathbb P(down\mid Thag) + \mathbb P(other\mid Thag) = 1.\)

If Thag is “equally good at explaining” all three outcomes (if Thag’s theory is equally compatible with all three events and produces equally clever explanations for each of them), then we might as well call this \(1/3\) probability for each of \(\mathbb P(up\mid Thag), \mathbb P(down\mid Thag),\) and \(\mathbb P(other\mid Thag)\). Note that Thag’s theory is isomorphic, in a probabilistic sense, to saying “I don’t know.”

But now sup­pose Grek make falsifi­able pre­dic­tion! Grek say, “Most things fall down!”

Then Grek not have all prob­a­bil­ity mass dis­tributed equally! Grek put 95% of prob­a­bil­ity mass in \(\mathbb P(down\mid Grek)!\) Only leave 5% prob­a­bil­ity di­vided equally over \(\mathbb P(up\mid Grek)\) and \(\mathbb P(other\mid Grek)\) in case rock be­have like bird.

Thag say this bad idea. If rock go up, Grek Theory of Gravitation disconfirmed by false prediction! Compared to Thag Theory that predicts 1/3 chance of \(up,\) will be likelihood ratio of 2.5% : 33% ~ 1 : 13 against Grek Theory! Grek embarrassed!

Grek say, she is con­fi­dent rock does go down. Things like bird are rare. So Grek will­ing to stick out neck and face po­ten­tial em­bar­rass­ment. Be­sides, is more im­por­tant to learn about if Grek The­ory is true than to save face.

Thag let go of rock. Rock fall down.

This ev­i­dence with like­li­hood ra­tio of 0.95 : 0.33 ~ 3 : 1 fa­vor­ing Grek The­ory over Thag The­ory.

“How you get such big like­li­hood ra­tio?” Thag de­mand. “Thag never get big like­li­hood ra­tio!”

Grek ex­plain is pos­si­ble to ob­tain big like­li­hood ra­tio be­cause Grek The­ory stick out neck and take prob­a­bil­ity mass away from out­comes \(up\) and \(other,\) risk­ing dis­con­fir­ma­tion if that hap­pen. This free up lots of prob­a­bil­ity mass that Grek can put in out­come \(down\) to make big like­li­hood ra­tio if \(down\) hap­pen.

Grek The­ory win be­cause falsifi­able and make cor­rect pre­dic­tion! If falsifi­able and make wrong pre­dic­tion, Grek The­ory lose, but this okay be­cause Grek The­ory not Grek.
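The exchange above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming Thag's "equally good at explaining everything" amounts to 1/3 per outcome and that Grek splits her leftover 5% evenly, as in the story; the dictionary and function names are illustrative only.

```python
from fractions import Fraction

OUTCOMES = ("up", "down", "other")

# Thag's theory "explains" every outcome equally well: 1/3 each.
thag = {outcome: Fraction(1, 3) for outcome in OUTCOMES}

# Grek's theory: 95% on "down", the remaining 5% split evenly.
grek = {
    "down": Fraction(95, 100),
    "up": Fraction(25, 1000),
    "other": Fraction(25, 1000),
}


def likelihood_ratio(outcome, theory_a, theory_b):
    """Return P(outcome | theory_a) / P(outcome | theory_b)."""
    return theory_a[outcome] / theory_b[outcome]


# Each theory has exactly one unit of probability mass to spend.
assert sum(thag.values()) == 1
assert sum(grek.values()) == 1

for outcome in OUTCOMES:
    ratio = likelihood_ratio(outcome, grek, thag)
    print(f"{outcome:>5}: Grek : Thag = {float(ratio):.3f} : 1")
```

The "down" row comes out a little under 3 : 1 in Grek's favor, and the "up" row is about 1 : 13 against her. Because both rows draw on the same fixed budget of probability mass, Grek can only buy the first by risking the second.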

Ad­vance prediction

On the next ex­per­i­ment, Thag lets go of the rock, watches it fall down, and then says, “Thag The­ory as­sign 100% prob­a­bil­ity to \(\mathbb P(down\mid Thag)\).”

Grek replies, “Grek think if Thag see rock fly up in­stead, Thag would’ve said \(\mathbb P(up\mid Thag) = 1.\) Thag en­gage in hind­sight bias.”

“Grek can’t prove Thag bi­ased,” says Thag. “Grek make ad hominem ar­gu­ment.”

“New rule,” says Grek. “Every­one say prob­a­bil­ity as­sign­ment be­fore thing hap­pens. That way, no need to ar­gue af­ter­wards.”

Thag thinks. “Thag say \(\mathbb P(up\mid Thag) = 1\) and \(\mathbb P(down\mid Thag) = 1\).”

“Thag vi­o­late prob­a­bil­ity ax­ioms,” says Grek. “Prob­a­bil­ity of all mu­tu­ally ex­clu­sive out­comes must sum to \(1\) or less. But good thing Thag say in ad­vance so Grek can see prob­lem.”

“That not fair!” ob­jects Thag. “Should be al­lowed to say af­ter­wards so no­body can tell!”


The rule of ad­vance pre­dic­tion is much more prag­mat­i­cally im­por­tant for in­for­mal the­o­ries than for­mal ones; and for these pur­poses, a the­ory is ‘for­mal’ when the the­ory’s pre­dic­tions are pro­duced in a suffi­ciently me­chan­i­cal and de­ter­mined way that any­one can plug the the­ory into a com­puter and get the same an­swer for what prob­a­bil­ity the the­ory as­signs.

When New­ton’s The­ory of Grav­i­ta­tion was pro­posed, it was con­sid­ered not-yet-fully-proven be­cause retro­d­ic­tions such as the tides, el­lip­ti­cal plane­tary or­bits, and Ke­pler’s Laws, had all been ob­served be­fore New­ton pro­posed the the­ory. Even so, a prag­matic Bayesian would have given New­ton’s the­ory a lot of credit for these retro­d­ic­tions, be­cause un­like, say, a psy­cholog­i­cal the­ory of hu­man be­hav­ior, it was pos­si­ble for any­one—not just New­ton—to sit down with a pen­cil and de­rive ex­actly the same pre­dic­tions from New­ton’s Laws. This wouldn’t com­pletely elimi­nate the pos­si­bil­ity that New­ton’s The­ory had in some sense been overfit­ted to Ke­pler’s Laws and the tides, and would then be in­ca­pable of fur­ther cor­rect new pre­dic­tions. But it did mean that, as a for­mal the­ory, there could be less prag­matic worry that Thag­ton was just say­ing, “Oh, well, of course my the­ory of ‘Planets go where they want’ would pre­dict el­lip­ti­cal or­bits; el­lip­ti­cal or­bits look nice.”

Asking a theory’s adherents what the theory says

Thag picks up an­other rock. “I say in ad­vance that Grek The­ory as­sign 0% prob­a­bil­ity to rock go­ing down.” Thag drops the rock. “Thag dis­prove Grek The­ory!”

Grek shakes her head. “Should ask ad­vo­cates of Grek The­ory what Grek The­ory pre­dicts.” Grek picks up an­other rock. “I say Grek The­ory as­sign \(\mathbb P(down\mid Grek) = 0.95\).”

“I say Grek The­ory as­sign \(\mathbb P(down\mid Grek) = 0\),” coun­ters Thag.

“That not how sci­ence work,” replies Grek. “Thag should say what Thag’s The­ory says.”

Thag thinks for a mo­ment. “Thag The­ory says rock has 95% prob­a­bil­ity of go­ing down.”

“What?” says Grek. “Thag just copy­ing Grek The­ory! Also, Thag not say that be­fore see­ing rocks go down!”

Thag smiles smugly. “Only Thag get to say what Thag The­ory pre­dict, right?”

Again for prag­matic rea­sons, we should first ask the ad­her­ents of an in­for­mal the­ory to say what the the­ory pre­dicts (a for­mal the­ory can sim­ply be op­er­ated by any­one, and if this is not true, we will not call the the­ory ‘for­mal’).

Furthermore, since you can find a fool following any cause, you should ask the smartest or most technically adept advocates of the theory. If there’s any dispute about who those are, ask separate representatives from the leading groups. Fame is definitely not the key qualifier; you should ask Murray Gell-Mann and not Deepak Chopra about quantum mechanics, even if more people have heard of Deepak Chopra’s beliefs about quantum mechanics than have heard of Murray Gell-Mann’s. If you really can’t tell the difference, ask them both; don’t ask only Chopra and then claim that Chopra gets to be the representative because he is most famous.

These types of courtesy rules would not be necessary if we were dealing with a sufficiently advanced Artificial Intelligence or ideal rational agent, but they make sense for human science, where people may be motivated to falsely construe another theory’s probability assignments.

This in­for­mal rule has its limits, and there may be cases where it seems re­ally ob­vi­ous that a hy­poth­e­sis’s pre­dic­tions ought not to be what the hy­poth­e­sis’s ad­her­ents claim, or that the the­ory’s ad­her­ents are just steal­ing the pre­dic­tions of a more suc­cess­ful the­ory. But there ought to be a large (if defea­si­ble) bias in fa­vor of let­ting a the­ory’s ad­her­ents say what that the­ory pre­dicts.


A few minutes later, Grek is picking up another rock. “\(\mathbb P(down\mid Grek) = 0.95\),” says Grek.

“\(\mathbb P(down\mid Thag) = 0.95\),” says Thag. “See, Thag assign high probability to outcomes observed. Thag win yet?”

“No,” says Grek. “Like­li­hood ra­tios 1 : 1 all time now, even if we be­lieve Thag. Thag’s the­ory not pick up ad­van­tage. Thag need to make bold pre­dic­tion other the­o­ries not make.”

Thag frowns. “Thag say… rock will turn blue when you let go this time? \(\mathbb P(blue\mid Thag) = 0.90\).”

“That very bold,” Grek says. “Grek The­ory not say that (nor any other ob­vi­ous ‘com­mon sense’ or ‘busi­ness as usual’ the­o­ries). Grek think that \(\mathbb P(blue\mid \neg Thag) < 0.01\) so Thag pre­dic­tion definitely has virtue of bold­ness. Will be big deal if Thag pre­dic­tion come true.”

\(\dfrac{\mathbb P(Thag\mid blue)}{\mathbb P(\neg Thag\mid blue)} > 90 \cdot \dfrac{\mathbb P(Thag)}{\mathbb P(\neg Thag)}\)

“Thag win now?” Thag says.

Grek lets go of the rock. It falls down to the ground. It does not turn blue.

“Bold pre­dic­tion not cor­rect,” Grek says. “Thag’s pre­dic­tion vir­tu­ous, but not win. Now Thag lose by 1 : 10 like­li­hood ra­tio in­stead. Very sci­ence, much falsifi­ca­tion.”

“Grek lure Thag into trap!” yells Thag.

“Look,” says Grek, “whole point is to set up sci­ence rules so cor­rect the­o­ries can win. If wrong the­o­rists lose quickly by try­ing to be sci­en­tifi­cally vir­tu­ous, is fea­ture rather than bug. But if Thag try to be good and loses, we shake hands and ev­ery­one still think well of Thag. Is nor­ma­tive so­cial ideal, any­way.”
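The two possible results of Thag's bold prediction can be worked through in the odds form of Bayes' rule. This is a rough sketch: the even prior odds are an assumption made purely for illustration, and Grek's "\(< 0.01\)" is replaced by the boundary value 0.01, so the first ratio is a lower bound.

```python
def posterior_odds(prior_odds, p_obs_given_h, p_obs_given_not_h):
    """Odds form of Bayes' rule: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_obs_given_h / p_obs_given_not_h)


P_BLUE_GIVEN_THAG = 0.90
P_BLUE_GIVEN_NOT_THAG = 0.01  # Grek only says "< 0.01"; this is the boundary value

prior = 1.0  # assumed even prior odds for Thag : not-Thag, for illustration only

# Rock turns blue: odds for Thag's theory multiply by at least 0.90 / 0.01 = 90.
odds_if_blue = posterior_odds(prior, P_BLUE_GIVEN_THAG, P_BLUE_GIVEN_NOT_THAG)

# Rock stays its usual color: odds multiply by 0.10 / 0.99, about 1 : 10 against.
odds_if_not_blue = posterior_odds(
    prior, 1 - P_BLUE_GIVEN_THAG, 1 - P_BLUE_GIVEN_NOT_THAG
)

print(f"blue:     odds x {odds_if_blue:.1f}")
print(f"not blue: odds x {odds_if_not_blue:.3f}")
```

The asymmetry is the point of boldness: the prediction stood to gain a factor of about 90 and ended up costing a factor of about 10.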


At a some­what later stage in the de­vel­op­ment of grav­i­ta­tional the­ory, the Aris­totelian syn­the­sis of Grek and Thag’s the­o­ries, “Most things have a fi­nal des­ti­na­tion of be­ing at the cen­ter of the Earth, and try to ap­proach that fi­nal des­ti­na­tion” comes up against Gal­ileo Gal­ilei’s “Most un­sup­ported ob­jects ac­cel­er­ate down­wards, and each sec­ond of pass­ing time the ob­ject gains an­other 9.8 me­ters per sec­ond of down­ward speed; don’t ask me why, I’m just ob­serv­ing it.”

“You’re not just pre­dict­ing that rocks are ob­served to move down­ward when dropped, are you?” says Aris­to­tle. “Be­cause I’m already pre­dict­ing that.”

“What we’re go­ing to do next,” says Gal­ileo, “is pre­dict how long it will take a bowl­ing ball to fall from the Lean­ing Tower of Pisa.” Gal­ileo takes out a pocket stop­watch. “When my friend lets go of the ball, you hit the ‘start’ but­ton, and as soon as the ball hits the ground, you hit the ‘stop’ but­ton. We’re go­ing to ob­serve ex­actly what num­ber ap­pears on the watch.”

After some fur­ther cal­ibra­tion to de­ter­mine that Aris­to­tle has a pretty con­sis­tent re­ac­tion time for press­ing the stop­watch but­ton if Gal­ileo snaps his fingers, Aris­to­tle looks up at the bowl­ing ball be­ing held from the Lean­ing Tower of Pisa.

“I think it’ll take any­where be­tween 0 and 5 sec­onds in­clu­sive,” Aris­to­tle says. “Not sure be­yond that.”

“Okay,” says Galileo. “I measured this tower to be 45 meters tall. Now, if air resistance is 0, after 3 seconds the ball should be moving downward at a speed of 9.8 * 3 = 29.4 meters per second. That speed increases continuously over the 3 seconds, so the ball’s average speed will have been 29.4 / 2 = 14.7 meters per second. And if the ball moves at an average speed of 14.7 meters per second, for 3 seconds, it will travel downward 44.1 meters. So the ball should take just a little more than 3 seconds to fall 45 meters. Like, an additional 1/33rd of a second or so.”
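Galileo's arithmetic can be reproduced directly. The sketch below assumes constant acceleration of 9.8 m/s² and no air resistance, as in his own statement; the function names are illustrative.

```python
import math

G = 9.8        # m/s^2, downward acceleration quoted by Galileo in the text
HEIGHT = 45.0  # m, tower height as measured in the text


def distance_fallen(t, g=G):
    """Distance after t seconds: average speed (g * t / 2) times elapsed time t."""
    return 0.5 * g * t * t


def fall_time(height, g=G):
    """Invert distance_fallen: time to fall `height` metres from rest."""
    return math.sqrt(2.0 * height / g)


t = fall_time(HEIGHT)
extra = t - 3.0

print(f"distance after 3 s: {distance_fallen(3.0):.1f} m")
print(f"time to fall {HEIGHT:.0f} m: {t:.4f} s ({extra:.4f} s beyond 3 s)")
```

The extra time beyond 3 seconds comes to roughly 0.03 s, so a watch that only shows whole seconds should read "3".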

[Figure: tower cartoon]

“Hm,” says Aris­to­tle. “This pock­et­watch only mea­sures whole sec­onds, so your the­ory puts all its prob­a­bil­ity mass on 3, right?”

“Not liter­ally all its prob­a­bil­ity mass,” Gal­ileo says. “It takes you some time to press the stop­watch but­ton once you see the ball start to fall, but it also takes you some time to press the but­ton af­ter you see the ball hit the ground. Those two sources of mea­sure­ment er­ror should mostly can­cel out, but maybe they’ll hap­pen not to on this par­tic­u­lar oc­ca­sion. We don’t have all that pre­cise or well-tested an ex­per­i­men­tal setup here. Like, if the stop­watch breaks and we ob­serve a 0, then that will be a defeat for Gal­ilean grav­ity, but it wouldn’t im­ply a fi­nal re­fu­ta­tion—we could get an­other watch and make bet­ter pre­dic­tions and make up for the defeat.”

“Okay, so what prob­a­bil­ities do you as­sign?” says Aris­to­tle. “I think my own the­ory is about equally good at ex­plain­ing any fal­ling time be­tween 0 and 5 sec­onds.”

[Figure: Aristotle’s probability distribution]

Galileo ponders. “Since we haven’t tested this setup yet… I think I’ll put something like 90% of my probability mass on a falling time between 3 and 4 seconds, which corresponds to an observable result of the watch showing ‘3’. Maybe I’ll put another 5% probability on air resistance having a bigger effect than I think it should over this distance, so between 4 and 5 seconds or an observable of ‘4’. Another 4% probability on this watch being slower than I thought, so 4% for a measured time between 2 and 3 and an observation of ‘2’. 0.99% probability on the stopwatch picking this time to break and show ‘1’ (or ‘0’, but that we both agree shouldn’t happen), and 0.01% probability on an observation of ‘5’ which basically shouldn’t happen for any reason I can think of.”

[Figure: Galileo’s probability distribution]

“Well,” says Aris­to­tle, “your the­ory cer­tainly has the sci­en­tific virtue of pre­ci­sion, in that, by con­cen­trat­ing most of its prob­a­bil­ity den­sity on a nar­row re­gion of pos­si­ble pre­cise ob­ser­va­tions, it will gain a great like­li­hood ad­van­tage over va­guer the­o­ries like mine, which roughly say that ‘things fall down’ and have made ‘suc­cess­ful pre­dic­tions’ each time things fall down, but which don’t pre­dict ex­actly how long they should take to fall. If your pre­dic­tion of ‘3’ comes true, that’ll be a 0.9 : 0.2 or 4.5 : 1 like­li­hood ra­tio fa­vor­ing Gal­ilean over Aris­totelian grav­ity.”

“Yes,” says Gal­ileo. “Of course, it’s not enough for the pre­dic­tion to be pre­cise, it also has to be cor­rect. If the watch shows ‘4’ in­stead, that’ll be a like­li­hood ra­tio of 0.05 : 0.20 or 1 : 4 against my the­ory. It’s bet­ter to be vague and right than to be pre­cise and wrong.”
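The ratios quoted in this exchange follow from the two stated distributions. Here is a small sketch restricted to the readings for which both speakers give explicit numbers ('2', '3' and '4'); treating Aristotle's "about equally good" as 0.20 per reading follows the figure used in the dialogue.

```python
# Probability Galileo assigns to each whole-second watch reading (from the text).
galileo = {2: 0.04, 3: 0.90, 4: 0.05}

# Aristotle is "about equally good" over 0 to 5 seconds; the dialogue uses 0.20 each.
aristotle = {reading: 0.20 for reading in galileo}


def likelihood_ratio(reading):
    """P(reading | Galileo) / P(reading | Aristotle)."""
    return galileo[reading] / aristotle[reading]


for reading in sorted(galileo):
    print(f"watch shows {reading}: Galileo : Aristotle = {likelihood_ratio(reading):.2f} : 1")
```

A reading of '3' favors Galileo by about 4.5 : 1, while '4' or '2' would count against him by roughly 1 : 4 and 1 : 5 respectively. Precision pays only when the narrow region is the one that actually occurs.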

Aris­to­tle nods. “Well, let’s test it, then.”

The bowl­ing ball is dropped.

The stop­watch shows 3 sec­onds.

“So do you be­lieve my the­ory yet?” says Gal­ileo.

“Well, I believe it somewhere in the range of four and a half times as much as I did previously,” says Aristotle. “But that part where you’re plugging in numbers like 9.8 and calculations like the square of the time strikes me as kinda complicated. Like, if I’m allowed to plug in numbers that precise, and do things like square them, there must be hundreds of different theories I could make which would be that complicated. By the quantitative form of Occam’s Razor, we need to penalize the prior probability of your theory for its algorithmic complexity. One observation with a likelihood ratio of 4.5 : 1 isn’t enough to support all that complexity. I’m not going to believe something that complicated because I see a stopwatch showing ‘3’ just that one time! I need to see more objects dropped from various different heights and verify that the times are what you say they should be. If I say the prior complexity of your theory is, say, 20 bits, then 9 more observations like this would do it. Of course, I expect you’ve already made more observations than that in private, but it only becomes part of the public knowledge of humankind after someone replicates it.”

“But of course,” says Gal­ileo. “I’d like to check your ex­per­i­men­tal setup and es­pe­cially your calcu­la­tions the first few times you try it, to make sure you’re not mea­sur­ing in feet in­stead of me­ters, or for­get­ting to halve the fi­nal speed to get the av­er­age speed, and so on. It’s a for­mal the­ory, but in prac­tice I want to check to make sure you’re not mak­ing a mis­take in calcu­lat­ing it.”

“Naturally,” says Aristotle. “Wow, it sure is a good thing that we’re both Bayesians and we both know the governing laws of probability theory and how they motivate the informal social procedures we’re following, huh?”

“Yes in­deed,” says Gal­ileo. “Other­wise we might have got­ten into a heated ar­gu­ment that could have lasted for hours.”
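Aristotle's estimate of "9 more observations" can be sanity-checked by counting evidence in bits. This is one plausible reading of his remark, not a claim about the exact rule he has in mind: it assumes a 20-bit complexity penalty means the prior odds start at about \(2^{-20}\), and that each independent observation contributes \(\log_2\) of its likelihood ratio.

```python
import math


def bits_of_evidence(likelihood_ratio):
    """Evidence from one observation, measured in bits (log base 2 of the ratio)."""
    return math.log2(likelihood_ratio)


def observations_needed(complexity_bits, likelihood_ratio):
    """Smallest whole number of such observations that outweighs the penalty."""
    return math.ceil(complexity_bits / bits_of_evidence(likelihood_ratio))


per_observation = bits_of_evidence(4.5)   # a bit over 2 bits
total = observations_needed(20, 4.5)      # observations in total
more = total - 1                          # one has already been made

print(f"{per_observation:.2f} bits per observation; {total} total, {more} more")
```

Under these assumptions each 4.5 : 1 observation is worth a little over 2 bits, so ten in total (nine more after the tower drop) are enough to cancel a 20-bit penalty.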


One of the reasons why Karl Popper was so enamored of “falsification” was the observation that falsification, in science, is more definite and final than confirmation. A classic parable along these lines is Newtonian gravitation versus General Relativity (Einsteinian gravitation): despite the tons and tons of experimental evidence for Newton’s theory that had accumulated up to the 19th century, there was no sense in which Newtonian gravity had been finally verified, and in the end it was discarded in favor of Einsteinian gravity. Now that Newton’s gravity has been tossed on the trash-heap, though, there’s no realistic probability of it ever coming back; the discard, unlike the adoption, is final.

Work­ing in the days be­fore Bayes be­came widely known, Pop­per put a log­i­cal in­ter­pre­ta­tion on this setup. Sup­pose \(H \rightarrow E,\) hy­poth­e­sis H log­i­cally im­plies that ev­i­dence E will be ob­served. If in­stead we ob­serve \(\neg E\) we can con­clude \(\neg H\) by the law of the con­tra­pos­i­tive. On the other hand, if we ob­serve \(E,\) we can’t log­i­cally con­clude \(H.\) So we can log­i­cally falsify a the­ory, but not log­i­cally ver­ify it.
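Popper's logical asymmetry shows up as a limiting case of Bayes' rule. In the sketch below, \(H \rightarrow E\) is modeled as \(\mathbb P(E\mid H) = 1\); the prior of 0.5 and the value \(\mathbb P(E\mid \neg H) = 0.3\) are assumptions chosen only to make the example concrete.

```python
def posterior(prior_h, p_obs_given_h, p_obs_given_not_h):
    """P(H | obs) by Bayes' rule for a hypothesis H and its negation."""
    joint_h = prior_h * p_obs_given_h
    joint_not_h = (1.0 - prior_h) * p_obs_given_not_h
    return joint_h / (joint_h + joint_not_h)


PRIOR_H = 0.5          # assumed prior probability of H
P_E_GIVEN_H = 1.0      # "H logically implies E"
P_E_GIVEN_NOT_H = 0.3  # assumed: E can also happen when H is false

# Observing not-E: H assigned it probability 0, so H is eliminated outright.
p_h_given_not_e = posterior(PRIOR_H, 1 - P_E_GIVEN_H, 1 - P_E_GIVEN_NOT_H)

# Observing E: H gains probability but cannot reach 1 while not-H also allows E.
p_h_given_e = posterior(PRIOR_H, P_E_GIVEN_H, P_E_GIVEN_NOT_H)

print(f"P(H | not E) = {p_h_given_not_e:.3f}")
print(f"P(H | E)     = {p_h_given_e:.3f}")
```

Seeing \(\neg E\) drives the posterior to exactly zero, while seeing \(E\) only raises it to some value short of one. The probabilistic picture only departs from the logical one when \(\mathbb P(E\mid H)\) is high but not exactly 1.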

Prag­mat­i­cally, this of­ten isn’t how sci­ence works.

In the nine­teenth cen­tury, ob­served anoma­lies were ac­cu­mu­lat­ing in the ob­ser­va­tion of Uranus’s or­bit. After tak­ing into ac­count all known in­fluences from all other planets, Uranus still was not ex­actly where New­ton’s the­ory said it should be. On the log­i­cal-falsifi­ca­tion view, since New­to­nian grav­i­ta­tion said that Uranus ought to be in a cer­tain pre­cise place and Uranus was not there, we ought to have be­come in­finitely cer­tain that New­ton’s the­ory was false. Sev­eral the­o­rists did sug­gest that New­ton’s the­ory might have a small er­ror term, and so be false in its origi­nal form.

The actual outcome was that Urbain Le Verrier and John Couch Adams independently suspected that the anomaly in Uranus’s orbit could be accounted for by a previously unobserved eighth planet. And, rather than vaguely say that this was their hypothesis, in a way that would just spread around the probability mass for Uranus’s location and cause Newtonian mechanics to be not too falsified, Le Verrier and Adams independently went on to calculate where the eighth planet ought to be. In 1846, Johann Galle observed Neptune, based on Le Verrier’s calculations: a tremendous triumph for Newtonian mechanics.

In 1859, Ur­bain Le Ver­rier rec­og­nized an­other prob­lem: Mer­cury was not ex­actly where it should be. While New­to­nian grav­ity did pre­dict that Mer­cury’s or­bit should pre­cess (the point of clos­est ap­proach to the Sun should it­self slowly ro­tate around the Sun), Mer­cury was pre­cess­ing by 38 arc-sec­onds per cen­tury more than it ought to be (later re­vised to 43). This anomaly was harder to ex­plain; Le Ver­rier thought there was [a tiny plane­toid or­bit­ing the Sun in­side the or­bit of Mer­cury](https://​​en.wikipe­​​wiki/​​Vul­can_(hy­po­thet­i­cal_planet)).

A bit more than half a cen­tury later, Ein­stein, work­ing on the equa­tions for Gen­eral Rel­a­tivity, re­al­ized that Mer­cury’s anoma­lous pre­ces­sion was ex­actly ex­plained by the equa­tions in their sim­plest and most el­e­gant form.

And that was the end of New­to­nian grav­i­ta­tion, per­ma­nently.

If we try to take Pop­per’s log­i­cal view of things, there’s no ob­vi­ous differ­ence be­tween the anomaly with Uranus and the anomaly with Mer­cury. In both cases, the straight­for­ward New­to­nian pre­dic­tion seemed to be falsified. If New­to­nian grav­i­ta­tion could bounce back from one log­i­cal dis­con­fir­ma­tion, why not the other?

From a Bayesian stand­point, we can see the differ­ence as fol­lows:

In the case of Uranus, there was no at­trac­tive al­ter­na­tive to New­to­nian me­chan­ics that was mak­ing bet­ter pre­dic­tions. The cur­rent the­ory seemed to be strictly con­fused about Uranus, in the sense that the cur­rent New­to­nian model was mak­ing con­fi­dent pre­dic­tions about Uranus that were much wronger than the the­ory ex­pected to be on av­er­age. This meant that there ought to be some bet­ter al­ter­na­tive. It didn’t say that the al­ter­na­tive had to be a non-New­to­nian one. The low \(\mathbb P(UranusLocation\mid currentNewton)\) cre­ated a po­ten­tial for some mod­ifi­ca­tion of the cur­rent model to make a bet­ter pre­dic­tion with higher \(\mathbb P(UranusLocation\mid newModel)\), but it didn’t say what had to change in the new model.

Even af­ter Nep­tune was ob­served, though, this wasn’t a fi­nal con­fir­ma­tion of New­to­nian me­chan­ics. While the new model as­signed very high \(\mathbb P(UranusLocation\mid Neptune \wedge Newton),\) there could, for all any­one knew, be some un­known Other the­ory that would as­sign equally high \(\mathbb P(UranusLocation\mid Neptune \wedge Other).\) In this case, New­ton’s the­ory would have no like­li­hood ad­van­tage ver­sus this un­known Other, so we could not say that New­ton’s the­ory of grav­ity had been con­firmed over ev­ery other pos­si­ble the­ory.

In the case of Mer­cury, when Ein­stein’s for­mal the­ory came along and as­signed much higher \(\mathbb P(MercuryLocation\mid Einstein)\) com­pared to \(\mathbb P(MercuryLocation\mid Newton),\) this cre­ated a huge like­li­hood ra­tio for Ein­stein over New­ton and drove the prob­a­bil­ity of New­ton’s the­ory very low. Even if some­day some other the­ory turns out to be bet­ter than Ein­stein, to do equally well at \(\mathbb P(MercuryLocation\mid Other)\) and also get even bet­ter \(\mathbb P(newObservation\mid Other),\) the fact that Ein­stein’s the­ory did do much bet­ter than New­ton on Mer­cury tells us that it’s pos­si­ble for sim­ple the­o­ries to do much bet­ter on Mer­cury, in a sim­ple way, that’s definitely not New­to­nian. So what­ever Other the­ory comes along will also do bet­ter on Mer­cury than \(\mathbb P(MercuryLocation\mid Newton)\) in a non-New­to­nian fash­ion, and New­ton will just be at a new, huge like­li­hood dis­ad­van­tage against this Other the­ory.

So—from a Bayesian stand­point—af­ter ex­plain­ing Mer­cury’s or­bital pre­ces­sion, we can’t be sure Ein­stein’s grav­i­ta­tion is cor­rect, but we can be sure that New­ton’s grav­i­ta­tion is wrong.

But this doesn’t re­flect a log­i­cal differ­ence be­tween falsifi­ca­tion and ver­ifi­ca­tion—ev­ery­thing takes place in­side a world of prob­a­bil­ities.
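A toy calculation shows how this works without any special logic of falsification. Every number below is invented for illustration (the source gives no actual likelihoods for Mercury's position), and "Other" stands for an unknown future theory assumed to match Einstein on Mercury.

```python
def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def update(beliefs, likelihoods):
    """One Bayesian update: multiply each prior by its likelihood, then renormalize."""
    return normalize({name: beliefs[name] * likelihoods[name] for name in beliefs})


# Hypothetical priors before the Mercury data is considered.
prior = {"Newton": 0.98, "Einstein": 0.01, "Other": 0.01}

# Hypothetical likelihoods of the observed precession under each theory.
p_mercury = {"Newton": 1e-6, "Einstein": 0.5, "Other": 0.5}

posterior = update(prior, p_mercury)

for name, probability in posterior.items():
    print(f"{name:>8}: {probability:.5f}")
```

With these made-up inputs, Newton drops to roughly one part in ten thousand even from a 98% prior, while Einstein and the unknown Other remain tied. That is the sense in which Newton can be confidently rejected while Einstein is only provisionally accepted.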

Pos­si­bil­ity of per­ma­nent confirmation

It’s worth not­ing that al­though New­ton’s the­ory of grav­i­ta­tion was false, some­thing very much like it was true. So while the be­lief “Planets move ex­actly like New­ton says” could only be pro­vi­sion­ally ac­cepted and was even­tu­ally over­turned, the be­lief, “All the kind of planets we’ve seen so far, in the kind of situ­a­tions we’ve seen so far, move pretty much like New­to­nian grav­ity says” was much more strongly con­firmed.

This im­plies that, con­tra Pop­per’s re­jec­tion of the very no­tion of con­fir­ma­tion, some the­o­ries can be fi­nally con­firmed, be­yond all rea­son­able doubt. E.g., the DNA the­ory of biolog­i­cal re­pro­duc­tion. No mat­ter what we won­der about quarks, there’s no plau­si­ble way we could be wrong about the ex­is­tence of molecules, or about there be­ing a dou­ble he­lix molecule that en­codes ge­netic in­for­ma­tion. It’s rea­son­able to say that the the­ory of DNA has been for­ever con­firmed be­yond a rea­son­able doubt, and will never go on the trash-heap of sci­ence no mat­ter what new ob­ser­va­tions may come.

This is pos­si­ble be­cause DNA is a non-fun­da­men­tal the­ory, given in terms like “molecules” and “atoms” rather than quarks. Even if quarks aren’t ex­actly what we think, there will be some­thing enough like quarks to un­der­lie the ob­jects we call pro­tons and neu­trons and the ex­is­tence of atoms and molecules above that, which means the ob­jects we call DNA will still be there in the new the­ory. In other words, the biolog­i­cal the­ory of DNA has a “some­thing sort of like this must be true” the­ory un­der­neath it. The hy­poth­e­sis that what Joseph Black called ‘fixed air’ and we call ‘car­bon diox­ide’, is in fact made up of one car­bon atom and two oxy­gen atoms, has been per­ma­nently con­firmed in a way that New­to­nian grav­ity was not per­ma­nently con­firmed.

There’s some amount of ob­ser­va­tion which would con­vince us that all sci­ence was a lie and there were fairies in the gar­den, but short of that, car­bon diox­ide is here to stay.

Nonethe­less, in or­di­nary sci­ence when we’re try­ing to figure out con­tro­ver­sies, work­ing to Bayes’ rule im­plies that a vir­tu­ous sci­en­tist should think like Karl Pop­per sug­gested:

  • Treat dis­con­fir­ma­tion as stronger than con­fir­ma­tion;

  • Only pro­vi­sion­ally ac­cept hy­pothe­ses that have a lot of fa­vor­able-seem­ing ev­i­dence;

  • Have some amount of dis­con­firm­ing ev­i­dence and pre­dic­tion-failures that makes you per­ma­nently put a hy­poth­e­sis on the trash-heap and give up hope on its re­s­ur­rec­tion;

  • Re­quire a qual­i­ta­tively more pow­er­ful kind of ev­i­dence than that, with di­rect ob­ser­va­tion of the phe­nomenon’s parts and pro­cesses in de­tail, be­fore you start think­ing of a the­ory as ‘con­firmed’.


What­ever the like­li­hood for \(\mathbb P(observation\mid hypothesis)\), it doesn’t change your be­liefs un­less you ac­tu­ally ex­e­cute the ex­per­i­ment, learn whether \(observation\) or \(\neg observation\) is true, and con­di­tion your be­liefs in or­der to up­date your prob­a­bil­ities.

In this sense, Bayes’ rule can also be said to mo­ti­vate the ex­per­i­men­tal method. Though you don’t nec­es­sar­ily need a lot of math to re­al­ize that draw­ing an ac­cu­rate map of a city re­quires look­ing at the city. Still, since the ex­per­i­men­tal method wasn’t quite ob­vi­ous for a lot of hu­man his­tory, it could maybe use all the sup­port it can get—in­clud­ing the cen­tral Bayesian idea of “Make ob­ser­va­tions to up­date your be­liefs.”

