Development phase unpredictable

Several proposed problems in advanced safety are alleged to be difficult because they depend on some property of a mature agent that is hard to predict in advance, at the time we are designing, teaching, or testing the agent. We call such properties ‘development phase unpredictable’. For example, the Unforeseen Maximums problem arises when we can’t search a rich solution space as widely as an advanced agent can, so it is development phase unpredictable which real-world strategy or outcome state will maximize some formal utility function.
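The Unforeseen Maximums example can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional strategy space, the `utility` function, and the two search ranges are hypothetical and not taken from any real system. It only shows how the maximizer found under a narrow development-time search can differ from the one a wider search finds.

```python
def utility(x: int) -> float:
    """Toy formal utility over integer-indexed strategies (hypothetical)."""
    # A modest peak near x = 5 that lies inside the designers' search range.
    local_peak = 10.0 - (x - 5) ** 2
    # A much higher peak near x = 900 that lies outside that range.
    distant_peak = 50.0 - 0.01 * (x - 900) ** 2
    return max(local_peak, distant_peak)


def best_strategy(candidates):
    """Return the candidate strategy with the highest utility."""
    return max(candidates, key=utility)


# Assumed search budgets: the developers examine a small region,
# while the more capable agent examines a much larger one.
designer_search = range(0, 20)
agent_search = range(0, 1000)

foreseen = best_strategy(designer_search)  # what we expect during development
actual = best_strategy(agent_search)       # what the wider search selects

print(foreseen, utility(foreseen))
print(actual, utility(actual))
```

In this sketch the utility function is identical in both searches; only the breadth of the search changes which strategy comes out on top, which is the sense in which the maximizing strategy is hard to predict during development.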


  • Unforeseen maximum

    When you tell AI to produce world peace and it kills everyone. (Okay, some SF writers saw that one coming.)


  • AI alignment

    The great civilizational problem of creating artificially intelligent computer systems such that running them is a good idea.