Vingean uncertainty

Of course, I never wrote the “important” story, the sequel about the first amplified human. Once I tried something similar. John Campbell’s letter of rejection began: “Sorry—you can’t write this story. Neither can anyone else.”… “Bookworm, Run!” and its lesson were important to me. Here I had tried a straightforward extrapolation of technology, and found myself precipitated over an abyss. It’s a problem writers face every time we consider the creation of intelligences greater than our own. When this happens, human history will have reached a kind of singularity—a place where extrapolation breaks down and new models must be applied—and the world will pass beyond our understanding. -- Vernor Vinge, True Names and other Dangers, p. 47.

Vingean unpredictability is a key part of how we think about a consequentialist intelligence which we believe is smarter than us in a domain. In particular, we usually think we can’t predict exactly what a smarter-than-us agent will do, because if we could predict that, we would be that smart ourselves (Vinge’s Principle).

If you could predict exactly what action Deep Blue would take on a chessboard, you could play as well as Deep Blue by making whatever move you predicted Deep Blue would make. It follows that Deep Blue’s programmers necessarily sacrificed their ability to intuit Deep Blue’s exact moves in advance, in the course of creating a superhuman chessplayer.

But this doesn’t mean Deep Blue’s programmers were confused about the criterion by which Deep Blue chose actions. Deep Blue’s programmers still knew in advance that Deep Blue would try to win rather than lose chess games. They knew that Deep Blue would try to steer the chess board’s future into a particular region that was high in Deep Blue’s preference ordering over chess positions. We can predict the consequences of Deep Blue’s moves better than we can predict the moves themselves.

“Vingean uncertainty” is the peculiar epistemic state we enter when we’re considering sufficiently intelligent programs; in particular, we become less confident that we can predict their exact actions, and more confident of the final outcome of those actions.

(Note that this rejects the claim that we are epistemically helpless and can know nothing about beings smarter than ourselves.)

Furthermore, our ability to think about agents smarter than ourselves is not limited to knowing a particular goal and predicting its achievement. If we found a giant alien machine that seemed very well-designed, we might be able to infer the aliens were superhumanly intelligent even if we didn’t know the aliens’ ultimate goals. If we saw metal pipes, we could guess that the pipes represented some stable, optimal mechanical solution which was made out of hard metal so as to retain its shape. If we saw superconducting cables, we could guess that this was a way of efficiently transporting electrical work from one place to another, even if we didn’t know what final purpose the electricity was being used for. This is the idea behind Instrumental convergence: if we can recognize that an alien machine is efficiently harvesting and distributing energy, we might recognize it as an intelligently designed artifact in the service of some goal even if we don’t know the goal.

Noncontainment of belief within the action probabilities

When reasoning under Vingean uncertainty, due to our lack of logical omniscience, our beliefs about the consequences of the agent’s actions are not fully contained in our probability distribution over the agent’s actions.

Suppose that on each turn of a chess game you are playing against Deep Blue, I ask you to put a probability distribution on Deep Blue’s possible chess moves. If you are a rational agent, you should be able to put a well-calibrated probability distribution on these moves—most trivially, by assigning every legal move an equal probability (if Deep Blue has 20 legal moves, and you assign each move 5% probability, you are guaranteed to be well-calibrated).

Now imagine a randomized game player RandomBlue that, on each round, draws randomly from the probability distribution you’d assign to Deep Blue’s move from the same chess position. On every turn, your belief about where you’ll observe RandomBlue move is equivalent to your belief about where you’d see Deep Blue move. But your belief about the probable end of the game is very different. (This is only possible due to your lack of logical omniscience—you lack the computing resources to map out the complete sequence of expected moves, from your beliefs about each position.)
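A minimal toy sketch may make this concrete. Everything in it is an illustrative assumption rather than a model of chess: five legal moves, independent Gaussian beliefs about each move’s true value, a “DeepBlue” that simply plays the move with the highest true value, and a “RandomBlue” that samples from our predictive distribution over DeepBlue’s move. By construction our per-move beliefs about the two players are identical, yet the value we expect the chosen move to actually have is not:

    import random
    from statistics import mean

    random.seed(0)

    # Illustrative assumptions, not chess: five legal moves, and our beliefs about
    # each move's true value are independent Gaussians around our point estimates.
    EST = [0.0, 0.2, -0.1, 0.1, 0.05]   # our point estimate of each move's value
    SIGMA = 1.0                          # how uncertain we are about those estimates
    N = len(EST)
    TRIALS = 200_000

    def sample_true_values():
        """Draw one possible world: the moves' true values, given our beliefs."""
        return [random.gauss(m, SIGMA) for m in EST]

    # Step 1: our (approximately calibrated) predictive distribution over the
    # maximizer's move is just how often each move turns out to be the best one.
    counts = [0] * N
    for _ in range(TRIALS):
        v = sample_true_values()
        counts[max(range(N), key=v.__getitem__)] += 1
    pred = [c / TRIALS for c in counts]

    # Step 2: in fresh sampled worlds, "DeepBlue" plays the truly best move, while
    # "RandomBlue" samples its move from our predictive distribution, ignoring v.
    deep_vals, rand_vals = [], []
    for _ in range(TRIALS):
        v = sample_true_values()
        deep_move = max(range(N), key=v.__getitem__)
        rand_move = random.choices(range(N), weights=pred)[0]
        deep_vals.append(v[deep_move])
        rand_vals.append(v[rand_move])

    print("predictive distribution over moves (same for both players):",
          [round(p, 3) for p in pred])
    print("expected value of DeepBlue's chosen move:  ", round(mean(deep_vals), 3))
    print("expected value of RandomBlue's chosen move:", round(mean(rand_vals), 3))

The gap between the last two numbers is exactly the information not contained in the per-move distribution: knowing that a move was chosen by a maximizer tells us something about its value that knowing the move alone does not.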

In particular, we could draw the following contrast between your reasoning about Deep Blue and your reasoning about RandomBlue:

  • When you see Deep Blue make a move to which you assigned a low probability, you think the rest of the game will go worse for you than you expected (that is, Deep Blue will do better than you previously expected).

  • When you see RandomBlue make a move that you assigned a low probability (i.e., a low probability that Deep Blue would make that move in that position), you expect to beat RandomBlue sooner than you previously expected (things will go worse for RandomBlue than your previous average expectation).

This reflects our belief in something like the instrumental efficiency of Deep Blue. When we estimate the probability that Deep Blue makes a move \(x\), we’re estimating the probability that, as Deep Blue estimated each move \(y\)’s expected probability of winning \(EU[y]\), Deep Blue found \(\forall y \neq x: EU[x] > EU[y]\) (neglecting the possibility of exact ties, which is unlikely with deep searches and floating-point position-value estimates). If Deep Blue picks \(z\) instead of \(x\), we know that Deep Blue estimated \(\forall y \neq z: EU[z] > EU[y]\), and in particular that Deep Blue estimated \(EU[z] > EU[x]\). This could be because Deep Blue’s estimate of \(x\)’s value came out lower than we expected, but for the low-probability move \(z\) to beat every other move as well implies that \(z\) had an unexpectedly high value relative to our own estimates. Thus, when Deep Blue makes a very unexpected move, we mostly expect that Deep Blue saw an unexpectedly good move that was better than what we thought was the best available move.

In contrast, when RandomBlue makes an unexpected move, we think the random number generator happened to land on a move that we justly assigned low worth, and hence we expect to defeat RandomBlue faster than we otherwise would have.
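Under the same illustrative assumptions as the sketch above, we can check this asymmetry directly: conditioning on the maximizer having chosen the move we rated worst pushes our estimate of that move’s value up, while conditioning on RandomBlue having chosen it leaves our estimate where it was:

    import random
    from statistics import mean

    random.seed(1)

    # Same toy beliefs as before (assumptions, not chess).
    EST = [0.0, 0.2, -0.1, 0.1, 0.05]
    SIGMA = 1.0
    N = len(EST)
    TRIALS = 300_000
    Z = 2   # the move we rate worst, hence the one we least expect a maximizer to play

    prior, given_deep, given_rand = [], [], []
    for _ in range(TRIALS):
        v = [random.gauss(m, SIGMA) for m in EST]   # one sampled world
        prior.append(v[Z])
        if max(range(N), key=v.__getitem__) == Z:   # the maximizer actually chose Z
            given_deep.append(v[Z])
        if random.random() < 0.1:                   # RandomBlue's choice ignores v
            given_rand.append(v[Z])

    print("P(maximizer plays Z):              ", round(len(given_deep) / TRIALS, 3))
    print("E[value of Z], prior:              ", round(mean(prior), 3))
    print("E[value of Z | maximizer chose Z]: ", round(mean(given_deep), 3))
    print("E[value of Z | RandomBlue chose Z]:", round(mean(given_rand), 3))

The first conditional expectation comes out well above the prior (choosing \(Z\) required it to beat every other move), while the second matches the prior, which is why the same surprising move shifts our forecast of the game in opposite directions for the two players.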

Features of Vingean reasoning

Some interesting features of reasoning under Vingean uncertainty:

  • We may find ourselves more confident of the predicted consequences of an action than of the predicted action.

  • We may be more sure about the agent’s instrumental strategies than its goals.

  • Due to our lack of logical omniscience, our beliefs about the agent’s action-mediated relation to the environment are not screened off by our probability distribution over the system’s probable next actions.

    • We update on the probable consequence of an action, and on the probable consequences of other actions not taken, after observing that the agent actually outputs that action.

  • If there is a compact way to describe the previous consequences of the agent’s previous actions, we might try to infer that this consequence is a goal of the agent. We might then predict similar consequences in the future, even without being able to predict the agent’s specific next actions.

Our expectation of Vingean unpredictability in a domain may break down if the domain is extremely simple and sufficiently closed. In this case there may be an optimal play that we already know, making superhuman (unpredictable) play impossible.

Cognitive uncontainability

Vingean unpredictability is one of the core reasons to expect cognitive uncontainability in sufficiently intelligent agents.

Vingean reflection

Vingean reflection is reasoning about cognitive systems, especially cognitive systems very similar to yourself (including your actual self), under the constraint that you can’t predict the exact future outputs. Deep Blue’s programmers, by reasoning about the way Deep Blue was searching through game trees, could arrive at a well-justified but abstract belief that Deep Blue was ‘trying to win’ (rather than trying to lose) and reasoning effectively to that end.

In Vingean reflection we need to make predictions about the consequence of operating an agent in an environment, without knowing the agent’s exact future actions—presumably via reasoning on some more abstract level, somehow. In tiling agents theory, Vinge’s Principle appears in the rule that we should talk about our successor’s specific actions only inside of quantifiers.
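Schematically (the notation below is an illustrative paraphrase, not the exact formalism of tiling agents theory), the predecessor does not derive which action its successor will output; it derives a statement in which the successor’s action appears only as a bound variable:

\[ \forall a : \ \mathrm{Outputs}(\mathit{Successor}, a) \rightarrow \mathrm{Acceptable}(\mathrm{Outcome}(a)) \]

so the predecessor can trust the consequences of running its successor without ever being able to compute, for any particular \(a\), whether the successor will in fact output it.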

“Vingean reflection” may be a much more general issue in the design of advanced cognitive systems than it might appear at first glance. An agent reasoning about the consequences of its current code, or considering what will happen if it spends another minute thinking, can be viewed as doing Vingean reflection. Vingean reflection can also be seen as the study of how a given agent wants thinking to occur in cognitive computations, which may be importantly different from how the agent currently thinks. (If these two coincide, we say the agent is reflectively stable.)

Tiling agents theory is presently the main line of research trying to (slowly) get started on formalizing Vingean reflection and reflective stability.

Children:

  • Vinge's Law

    You can’t predict exactly what someone smarter than you would do, because if you could, you’d be that smart yourself.

  • Deep Blue

    The chess-playing program, built by IBM, that defeated world chess champion Garry Kasparov in their 1997 match.

Parents:

  • Advanced agent properties

    How smart does a machine intelligence need to be, for its niceness to become an issue? “Advanced” is a broad term to cover cognitive abilities such that we’d need to start considering AI alignment.