Cognitive uncontainability

Vingean unpredictability is when an agent is cognitively uncontainable because it is smarter than us: if you could predict in advance exactly where Deep Blue would move, you could play chess at least as well as Deep Blue yourself by doing whatever you predicted Deep Blue would do in your shoes.

Although Vingean unpredictability is the classic way in which cognitive uncontainability can arise, other possibilities are imaginable. For instance, the AI could be operating in a rich domain and searching a part of the search space that humans have difficulty handling, while still being dumber or less competent overall than a human. In this case the AI’s strategies might still be unpredictable to us, even though the AI is less effective overall. Most anecdotes about AI algorithms doing surprising things can be viewed from this angle.

An extremely narrow, exhaustibly searchable domain may yield cognitive containability even for an intelligence locally superior to a human’s. Even a perfect Tic-Tac-Toe player can only draw against a human who knows the basic strategies, because humans can also play perfect Tic-Tac-Toe. Of course this is only true so long as the agent can’t modulate some transistors to form a wireless radio, escape onto the Internet, and offer a nearby bystander twenty thousand dollars to punch the human in the face. In that case the agent’s strategic options would have included, in retrospect, things that affected the real world, and the real world is a much more complicated domain than Tic-Tac-Toe. There’s some sense in which richer domains seem likely to feed into increased cognitive uncontainability, but it’s worth remembering that every game and every computer is embedded in the extremely complicated real world.
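The claim that Tic-Tac-Toe is exhaustibly searchable can be checked directly. The following is a minimal sketch, not anything from the original text: the board encoding (a 9-character string of "X", "O", and ".") and the function names winner and game_value are illustrative assumptions. It searches every line of play with plain minimax and reports the value of the empty board under perfect play by both sides. It models only the game itself, not the real world the game is embedded in.

```python
from functools import lru_cache

# Hypothetical encoding: the board is a 9-character string read row by row,
# with "X", "O", or "." for an empty square.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def game_value(board, to_move):
    """Value under perfect play, from X's side: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0  # board full with no winner: a draw
    other = "O" if to_move == "X" else "X"
    values = [
        game_value(board[:i] + to_move + board[i + 1:], other)
        for i, square in enumerate(board)
        if square == "."
    ]
    # X picks the best outcome for X; O picks the worst outcome for X.
    return max(values) if to_move == "X" else min(values)


print(game_value("." * 9, "X"))  # prints 0: perfect play by both sides is a draw
```

Because the whole game tree fits comfortably in memory, the search finishes almost instantly, which is the sense in which no amount of extra intelligence buys a better result inside this domain.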

Strong cognitive uncontainability is when the agent knows some facts we don’t, which it can use to formulate strategies that we wouldn’t be able to recognize in advance as successful. To someone in the 11th century C.E. trying to cool their house, bringing in cool water from the nearby river to run over some nearby surfaces might be an understandable solution; but if you showed them a sketch of an air conditioner, without running the air conditioner or explaining how it worked, they wouldn’t recognize this sketch as a smart solution, because they wouldn’t know the further facts required to see why it would work. When an agent can win using options that we didn’t imagine, couldn’t invent, and wouldn’t understand even if we caught a glimpse of them in advance, it is strongly cognitively uncontainable in the same way that the 21st century is strongly uncontainable from the standpoint of the 11th century.


