Object-level vs. indirect goals
An ‘object-level goal’ is a goal that involves no indirection, doesn’t require any further computation or observation to be fully specified, and is evaluated directly on events or things in the agent’s model of the universe. Contrast with meta-level or indirect goals.
Some examples of object-level goals might be these:
Win a chess game.
Cause a pawn to advance to the eighth row, in order to win a chess game.
Here are some examples of goals or preference frameworks with properties that make them not object-level:
“Observe Alice, model what she wants, then do that.”
This framework is indirect because it doesn’t directly say what events or things in the universe are good or bad, but rather gives a recipe for deciding which events are good or bad (namely, model Alice).
This framework is not fully locally specified, because we have to observe Alice, and maybe compute further based on our observations of her, before we find out the actual evaluator we run to weigh events as good or bad.
“Induce a compact category covering the proximal causes of sensory data labeled as positive instances and not covering sensory data labeled as negative instances. The utility of an outcome is the number of events classified as positive by the induced category.”
This framework is not fully specified because it is based on a supervised dataset, and the actual goals will vary with the dataset obtained. We don’t know what the agent wants when we’re told about the induction algorithm; we also have to be told about the dataset, and we haven’t been told about that yet.
“Compute an average of what all superintelligences will want, relative to some prior distribution over the origins of superintelligences, and then do that.”
This framework might be computable locally without any more observations, but it still involves a level of indirection, and is not locally complete in the sense that it would take a whole lot more computation before the agent knew what it ought to think of eating an apple.
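The contrast can be sketched in code. The following is a minimal toy illustration, not a proposal for how any real agent represents goals: the `Outcome` and `Evaluator` types, the function names, and the keyword-matching "model" of Alice are all hypothetical, introduced only for this example. An object-level goal is an evaluator that can be applied to outcomes immediately; an indirect goal is a recipe that returns an evaluator only after observations come in.

```python
from typing import Callable, Dict, List

# Hypothetical toy types: an "outcome" is just a dict of facts about the world.
Outcome = Dict[str, object]
Evaluator = Callable[[Outcome], float]


def win_chess(outcome: Outcome) -> float:
    """Object-level goal: scored directly on the outcome, nothing else needed."""
    return 1.0 if outcome.get("chess_result") == "win" else 0.0


def do_what_alice_wants(observations: List[str]) -> Evaluator:
    """Indirect goal: a recipe that yields an evaluator only after observing Alice.

    The "model" of Alice is a deliberately crude stand-in (keyword matching).
    """
    wants_apple = any("apple" in o for o in observations)

    def evaluator(outcome: Outcome) -> float:
        if wants_apple:
            return 1.0 if outcome.get("alice_has_apple") else 0.0
        return 0.0  # no preference inferred from these observations

    return evaluator


outcome = {"chess_result": "win", "alice_has_apple": True}

# The object-level goal can be evaluated right away.
object_level_score = win_chess(outcome)

# The indirect goal yields different evaluators for different observations.
evaluator_a = do_what_alice_wants(["Alice says: I would love an apple"])
evaluator_b = do_what_alice_wants(["Alice says nothing"])
score_a = evaluator_a(outcome)
score_b = evaluator_b(outcome)
```

In this sketch, `win_chess` has the type of an evaluator, while `do_what_alice_wants` has the type of a function that produces one. The same outcome receives different scores under `evaluator_a` and `evaluator_b`, which is the sense in which the indirect framework is not fully specified until Alice has been observed.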
The object-level vs. meta-level distinction should not be confused with the terminal vs. instrumental distinction.