Relevant powerful agent

A cognitively powerful agent is relevant if it is cognitively powerful enough to be a game-changer in the larger dilemma faced by Earth-originating intelligent life. Conversely, an agent is irrelevant if it is too weak to make much of a difference, or if the cognitive problems it can solve or tasks it is authorized to perform don’t significantly change the overall situation we face.

Definition

Intuitively speaking, a value-aligned AI is ‘relevant’ to the extent that the answer is ‘yes’ at the box ‘Can we use this AI to produce a benefit that solves the larger dilemma?’ in this flowchart, or to the extent that it is part of a larger plan that gets us to the lower green circle without any “Then a miracle occurs” steps. A cognitively powerful agent is relevant at all only if its existence can bring about good or bad outcomes; e.g., a big neural net is not ‘relevant’ because it doesn’t end the world one way or another.

(A better word than ‘relevant’ might be helpful here.)

Examples

Positive examples

  • A hypothetical agent that can bootstrap to nanotechnology by solving the inverse protein folding problem and then shut down other AI projects, in a way that its programmers can reasonably know to be safe enough to authorize, would be relevant.

Negative examples

  • An agent authorized to prove or disprove the Riemann Hypothesis, but not to do anything else, is not relevant (unless knowing whether the Riemann Hypothesis is true somehow changes everything for the basic dilemma of AI).

  • An oracle that can only output verified HOL proofs is not ‘relevant’ until someone can describe theorems to prove such that firm knowledge of their truth would be a game-changer for the AI situation. (Hypothesizing that someone else will come up with such a theorem, if you just build the oracle, is a Hail Mary step in the plan.)

Importance

Many proposals for AI safety, especially advanced safety, restrict the applicability of the AI so severely that the AI is no longer allowed to do anything that seems like it could solve the larger dilemma. (E.g., an oracle that is only allowed to give us binary answers about whether it thinks certain mathematical facts are true, where nobody has yet said how to use this ability to save the world.)

Conversely, proposals to use AIs to do things impactful enough to solve the larger dilemma generally run smack into all the usual advanced safety problems, especially if the AI must operate in the rich domain of the real world to carry out the task (this tends to require full trust).

Open problem

It is an open problem to propose a relevant, limited AI that would be significantly easier to handle than the general safety problem, while also being useful enough to resolve the larger dilemma.
