Task-directed AGI

A task-based AGI is an AGI intended to follow a series of human-originated orders, with each order being of limited scope: “satisficing” in the sense that the order can be accomplished using a bounded amount of effort and resources (as opposed to goals that can be fulfilled more and more strongly by expending more and more effort).
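
As a rough illustration of this bounded-versus-unbounded distinction, here is a minimal Python sketch (the function names are invented for illustration, and the example borrows the “paint one car pink” Task that appears later on this page), contrasting a Task preference that is fully satisfied by a bounded accomplishment with an open-ended preference that keeps rewarding further effort:

```python
# Minimal sketch (illustrative names only): a bounded, "satisficing" Task
# preference versus an open-ended maximizing preference.

def task_utility(cars_painted_pink: int) -> float:
    """Bounded: full credit once one car is pink; further effort adds nothing."""
    return 1.0 if cars_painted_pink >= 1 else 0.0

def open_ended_utility(cars_painted_pink: int) -> float:
    """Unbounded: every additional pink car scores higher, so the optimization
    pressure never runs out."""
    return float(cars_painted_pink)

# An agent scoring plans with task_utility can stop after one car; an agent
# scoring plans with open_ended_utility is still rewarded for converting ever
# more of the universe into pink-painted cars.
print(task_utility(1), task_utility(10**6))              # 1.0 1.0
print(open_ended_utility(1), open_ended_utility(10**6))  # 1.0 1000000.0
```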

In Bostrom’s typology, this is termed a “Genie”. It contrasts with a “Sovereign” AGI that acts autonomously in the pursuit of long-term real-world goals.

Building a safe Task AGI might be easier than building a safe Sovereign for the following reasons:

  • A Task AGI can be “online”: the AGI can potentially query the user before and during Task performance (assuming an ambiguous situation arises and is successfully identified as ambiguous).

  • A Task AGI can potentially be limited in various ways, since a Task AGI doesn’t need to be as powerful as possible in order to accomplish its limited-scope Tasks. A Sovereign would presumably engage in all-out self-improvement. (This isn’t to say Task AGIs would automatically not self-improve, only that it’s possible in principle to limit the power of a Task AGI to only the level required to do the targeted Tasks, if the associated safety problems can be solved.)

  • Tasks, by assumption, are limited in scope: they can be accomplished and completed, inside some limited region of space and time, using some limited amount of effort, after which no further effort is required. (To gain this advantage, a state of Task accomplishment should not go higher and higher in preference as more and more effort is expended on it open-endedly.)

  • Assuming that users can identify intended goals for the AGI that are valuable and pivotal, the identification problem of describing what constitutes a safe performance of that Task might be simpler than giving the AGI a complete description of normativity in general. That is, communicating to an AGI an adequate description of “cure cancer” (without killing patients or causing other side effects), while still difficult, might be simpler than communicating an adequate description of all normative value. Task AGIs fall on the narrow side of Ambitious vs. narrow value learning.

Relative to the problem of building a Sovereign, trying to build a Task AGI instead might step down the problem from “impossibly difficult” to “insanely difficult”, while still maintaining enough power in the AI to perform pivotal acts.

The obvious disadvantage of a Task AGI is moral hazard—it may tempt the users in ways that a Sovereign would not. A Sovereign has moral hazard chiefly during the development phase, when the programmers and users are perhaps not yet in a position of special relative power. A Task AGI has ongoing moral hazard as it is used.

Eliezer Yudkowsky has suggested that people only confront many important problems in value alignment when they are thinking about Sovereigns, but that, at the same time, Sovereigns may be impossibly hard in practice. Yudkowsky advocates that people think about Sovereigns first and list out all the associated issues before stepping down their thinking to Task AGIs: thinking about Task AGIs from the start may result in premature pruning, while thinking about Sovereigns is more likely to generate a complete list of problems, which can then be checked against particular Task AGI approaches to see whether those problems have become any easier.

Three distinguished subtypes of Task AGI are these:

  • Oracles, AIs intended only to answer questions, possibly drawn from some restricted question set.

  • Known-algorithm AIs, which are not self-modifying, or only very weakly self-modifying, such that their algorithms and representations are mostly known and mostly stable.

  • Behaviorist Genies, which are meant not to model human minds, or to model them in only very limited ways, while having great material understanding (e.g., potentially the ability to invent and deploy nanotechnology).

Subproblems

The problem of making a safe genie involves numerous subtopics, such as low impact, mild optimization, and conservatism, as well as standard AGI safety problems like reflective stability and the safe identification of intended goals.

(See here for a separate page on open problems in Task AGI safety that might be ready for current research.)

Some further problems beyond those appearing in the page above are:

Children:

  • Behaviorist genie

    An advanced agent that’s forbidden to model minds in too much detail.

  • Epistemic exclusion

    How would you build an AI that, no matter what else it learned about the world, never knew or wanted to know what was inside your basement?

  • Open subproblems in aligning a Task-based AGI

    Open research problems, especially ones we can model today, in building an AGI that can “paint all cars pink” without turning its future light cone into pink-painted cars.

  • Low impact

    The open problem of having an AI carry out tasks in ways that cause minimum side effects and change as little of the rest of the universe as possible.

  • Conservative concept boundary

    Given N example burritos, draw a boundary around what counts as a ‘burrito’ that is relatively simple and admits as few positive instances as possible; this helps make sure the next thing generated is actually a burrito. (See the code sketch following this list.)

  • Querying the AGI user

    Postulating that an advanced agent will check something with its user probably comes with some standard issues and gotchas (e.g., prioritizing what to query, not manipulating the user, and so on).

  • Mild optimization

    An AGI which, if you ask it to paint one car pink, just paints one car pink and doesn’t tile the universe with pink-painted cars, because it’s not trying that hard to max out its car-painting score.

  • Task identification problem

    If you have a task-based AGI (Genie) then how do you pinpoint exactly what you want it to do (and not do)?

  • Safe plan identification and verification

    On a particular task or problem, the issue of how to communicate to the AGI what you want it to do and all the things you don’t want it to do.

  • Faithful simulation

    How would you identify, to a Task AGI (aka Genie), the problem of scanning a human brain, and then running a sufficiently accurate simulation of it for the simulation to not be crazy or psychotic?

  • Task (AI goal)

    When building the first AGIs, it may be wiser to assign them only goals that are bounded in space and time, and can be satisfied by bounded efforts.

  • Limited AGI

    Task-based AGIs don’t need unlimited cognitive and material powers to carry out their Tasks, which means their powers can potentially be limited.

  • Oracle

    System designed to safely answer questions.

  • Boxed AI

    Idea: what if we limit how the AI can interact with the world? That’ll make it safe, right?
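
As a hedged illustration of the “Conservative concept boundary” item above (a minimal sketch assuming only NumPy; the function names and the bounding-box rule are illustrative choices, not a method specified by that page), one simple conservative boundary is the tight axis-aligned box around the examples: it has a short description and admits comparatively few positive instances while still covering every example burrito.

```python
# Minimal sketch of a conservative concept boundary (illustrative only):
# classify a new candidate as a "burrito" only if it lies inside the tight
# axis-aligned bounding box of the N example burritos' feature vectors.

import numpy as np

def fit_conservative_box(examples: np.ndarray):
    """examples: (N, d) array of feature vectors for known-good burritos."""
    return examples.min(axis=0), examples.max(axis=0)

def is_positive(candidate: np.ndarray, box) -> bool:
    """Conservative: anything outside the experienced range is rejected."""
    low, high = box
    return bool(np.all(candidate >= low) and np.all(candidate <= high))

# Usage: the boundary errs on the side of "not a burrito" for novel inputs.
examples = np.array([[1.0, 0.20], [1.2, 0.30], [0.9, 0.25]])
box = fit_conservative_box(examples)
print(is_positive(np.array([1.1, 0.25]), box))  # True: inside the box
print(is_positive(np.array([5.0, 0.25]), box))  # False: novel, rejected
```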

Parents:

  • Strategic AGI typology

    What broad types of advanced AIs, corresponding to which strategic scenarios, might it be possible or wise to create?

  • AI alignment

    The great civilizational problem of creating artificially intelligent computer systems such that running them is a good idea.