Object-level vs. indirect goals

An ‘object-level goal’ is a goal that involves no indirection, doesn’t require any further computation or observation to be fully specified, and is evaluated directly on events or things in the agent’s model of the universe. Contrast with meta-level or indirect goals.

Some examples of object-level goals might be these:

  • “Make as many paperclips as possible.”

  • “Fill the universe with as much diamond material as possible.”

Here are some example cases of goals or preference frameworks with properties that make them not object-level:

  • “Observe Alice, model what she wants, then do that.”

  • This framework is indirect because it doesn’t directly say which events or things in the universe are good or bad, but rather gives a recipe for deciding which events are good or bad (namely, model Alice).

  • This framework is not fully locally specified, because we have to observe Alice, and perhaps compute further based on our observations of her, before we find out the actual evaluator we run to weigh events as good or bad.

  • “Induce a compact category covering the proximal causes of sensory data labeled as positive instances and not covering sensory data labeled as negative instances. The utility of an outcome is the number of events classified as positive by the induced category.”

  • This framework is not fully specified because it is based on a supervised dataset, and the actual goal will vary with the dataset obtained. We don’t know what the agent wants when we’re told only about the induction algorithm; we also have to be told about the dataset, and we haven’t been told about that yet.

  • “Compute an average of what all superintelligences would want, relative to some prior distribution over the origins of superintelligences, and then do that.”

  • This framework might be computable locally without any further observations, but it still involves a level of indirection, and it is not locally complete in the sense that it would take a great deal more computation before the agent knew what it ought to think of eating an apple.
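The distinction can be sketched as a toy program. Everything here is hypothetical illustration: `infer_preferences` is a trivial stand-in for the genuinely hard step of modeling what Alice wants, and outcomes are simplified to dictionaries of counts.

```python
# Toy sketch: an object-level goal evaluates outcomes directly, while an
# indirect goal must first run further computation/observation to obtain
# the evaluator it will apply to outcomes.

def object_level_utility(outcome):
    # Fully specified in advance: score the outcome directly.
    # (Hypothetical object-level goal: count paperclips.)
    return outcome.get("paperclips", 0)

def infer_preferences(observations_of_alice):
    # Stand-in for the hard part: modeling what Alice wants.
    # Here we just pretend the item Alice rates highest is what she values.
    valued_key = max(observations_of_alice, key=observations_of_alice.get)
    return lambda outcome: outcome.get(valued_key, 0)

def indirect_utility(outcome, observations_of_alice):
    # Not fully specified in advance: the actual evaluator is only known
    # after observing Alice and computing an inference over those data.
    inferred_evaluator = infer_preferences(observations_of_alice)
    return inferred_evaluator(outcome)
```

In this sketch, `object_level_utility` can score any outcome with no further input, whereas `indirect_utility` cannot be evaluated at all until the observations of Alice arrive; which events count as good depends on data the agent does not yet have.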

The object-level vs. meta-level distinction should not be confused with the terminal vs. instrumental distinction. An indirect goal such as “do what Alice wants” can itself be terminal: the indirection is in how the goal is specified, not in why it is pursued.

