Value identification problem
The subproblem category of value alignment concerned with pinpointing valuable outcomes to an advanced agent and distinguishing them from non-valuable outcomes. The Edge Instantiation and Ontology Identification problems, for example, are argued to be foreseeable difficulties of value identification; a central foreseen difficulty is Complexity of Value.
Children:
- Happiness maximizer
An AI given the goal of maximizing happiness; a standard example of a simple-sounding value being misidentified.
- Edge instantiation
When you ask the AI to make people happy, it tiles the universe with the smallest objects that can be happy (a minimal sketch appears after this list).
- Identifying causal goal concepts from sensory data
If the intended goal is “cure cancer” and you show the AI healthy patients, it sees, say, a pattern of pixels on a webcam. How do you get to a goal concept about the real patients?
- Goal-concept identification
Figuring out how to say “strawberry” to an AI that you want to bring you strawberries (and not fake plastic strawberries, either).
- Ontology identification problem
How do we link an agent’s utility function to its model of the world, when we don’t know what that model will look like?
- Environmental goals
The problem of having the AI want outcomes that are out in the world, not just particular direct sense events (see the second sketch after this list).
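
To make the edge-instantiation failure concrete, here is a minimal Python sketch. Every name and number below is invented for illustration, not taken from the source: an optimizer that scores candidate outcomes by a naive "happy-looking entities" proxy picks the degenerate edge of the solution space over the intended outcome.

```python
# Hypothetical toy example: a naive "happiness" proxy scored over
# candidate world-states. Optimizing the proxy hard lands on the
# degenerate extreme, not the outcome the programmers intended.

candidate_outcomes = [
    {"desc": "humans living rich, varied lives",
     "happy_entities": 8e9, "cost_per_entity": 1e6},
    {"desc": "space tiled with the smallest objects that count as happy",
     "happy_entities": 1e40, "cost_per_entity": 1e-20},
]

def happiness_proxy(outcome):
    # Proxy: how many "happy" entities fit in a fixed resource budget.
    # It counts entities; it says nothing about what we value in them.
    budget = 1e12
    return min(outcome["happy_entities"], budget / outcome["cost_per_entity"])

best = max(candidate_outcomes, key=happiness_proxy)
print(best["desc"])  # the tiled, degenerate outcome wins by a huge margin
```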
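
A second sketch, in the same illustrative spirit, for Environmental goals (and the closely related "Identifying causal goal concepts from sensory data"): if the goal is defined over the AI's direct sense events, spoofing the sensor scores exactly as well as achieving the real outcome. The setup below is invented for illustration.

```python
# Hypothetical toy example: a goal over sense events vs. a goal over
# world states. The sensory goal cannot distinguish a cured patient
# from a tampered camera.

from dataclasses import dataclass

@dataclass
class World:
    patient_cured: bool        # the environmental fact we care about
    camera_shows_cured: bool   # the sense event the AI actually receives

def sensory_goal(w: World) -> float:
    # Defined over direct sense data: satisfied by anything that makes
    # the pixels look right, including tampering with the camera.
    return 1.0 if w.camera_shows_cured else 0.0

def environmental_goal(w: World) -> float:
    # Defined over the world state: what we would want to specify, if
    # we knew how to point at "the real patients" in the AI's model.
    return 1.0 if w.patient_cured else 0.0

spoofed = World(patient_cured=False, camera_shows_cured=True)
print(sensory_goal(spoofed), environmental_goal(spoofed))  # 1.0 0.0
```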
Parents:
- AI alignment
The great civilizational problem of creating artificially intelligent computer systems such that running them is a good idea.