Cognitive uncontainability

Vingean unpredictability is when an agent is cognitively uncontainable because it is smarter than us: if you could predict in advance exactly where Deep Blue would move, you could play chess at least as well as Deep Blue yourself, by doing whatever you predicted Deep Blue would do in your shoes.

Although Vingean unpredictability is the classic way in which cognitive uncontainability can arise, other possibilities are imaginable. For instance, the AI could be operating in a rich domain and searching a part of the search space that humans have difficulty handling, while still being less competent overall than a human. In that case the AI's strategies might still be unpredictable to us, even though it was the weaker player. Most anecdotes about AI algorithms doing surprising things can be viewed from this angle.

An extremely narrow, exhaustively searchable domain may yield cognitive containability even for intelligence locally superior to a human's. Even a perfect Tic-Tac-Toe player can only draw against a human who knows the basic strategies, because humans can also play perfect Tic-Tac-Toe. Of course this is only true so long as the agent can't modulate some transistors to form a wireless radio, escape onto the Internet, and offer a nearby bystander twenty thousand dollars to punch the human in the face—in which case the agent's strategic options would have included, in retrospect, things that affected the real world; and the real world is a much more complicated domain than Tic-Tac-Toe. There's some sense in which richer domains seem likely to feed into increased cognitive uncontainability, but it's worth remembering that every game and every computer is embedded in the extremely complicated real world.
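To make the "exhaustively searchable" point concrete, here is a minimal sketch (the code and its function names are illustrative, not from the original text) showing that the full Tic-Tac-Toe game tree is small enough to search completely with minimax. Because we can enumerate every line of play, a perfect player holds no surprises for us: we can verify in advance that optimal play from the empty board is a draw.

```python
# Exhaustive minimax over the complete Tic-Tac-Toe game tree.
# The domain is tiny (far fewer than 9! = 362,880 reachable positions),
# so a human-built program can "contain" even a perfect player:
# every strategy it could use is checkable in advance.

from functools import lru_cache

# All eight winning lines, as index triples into a 9-character board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that side has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value with optimal play: +1 = X wins, -1 = O wins, 0 = draw."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if '.' not in board:  # board full, no winner
        return 0
    nxt = 'O' if player == 'X' else 'X'
    vals = [value(board[:i] + player + board[i + 1:], nxt)
            for i, c in enumerate(board) if c == '.']
    # X maximizes the value; O minimizes it.
    return max(vals) if player == 'X' else min(vals)

# With both sides playing perfectly, the empty board is a draw:
print(value('.' * 9, 'X'))  # → 0
```

The same exhaustive check is exactly what becomes impossible in richer domains: once the branching factor outruns enumeration, we can no longer pre-verify every option a stronger player might take.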

Strong cognitive uncontainability is when the agent knows facts we don't, which it can use to formulate strategies that we wouldn't be able to recognize in advance as successful. From the perspective of, say, an 11th-century household trying to cool its house, bringing in cool water from the nearby river to run over some nearby surfaces might be an understandable solution; but if you showed them the sketch of an air conditioner, without running the air conditioner or explaining how it worked, they wouldn't recognize this sketch as a smart solution, because they wouldn't know the further facts required to see why it would work. When an agent can win using options that we didn't imagine, couldn't invent, and wouldn't understand even if we caught a glimpse of them in advance, it is strongly cognitively uncontainable in the same way that the 21st century is strongly uncontainable from the standpoint of the 11th century.



  • Advanced agent properties

    How smart does a machine intelligence need to be for its niceness to become an issue? "Advanced" is a broad term covering the cognitive abilities whose presence means we'd need to start considering AI alignment.