Shutdown problem

The ‘shutdown problem’ is the problem of creating a sufficiently advanced Artificial Intelligence that will, at the press of a button, suspend itself safely to disk, and that neither objects to the existence of this button nor tries to prevent it from being pressed.

One possible research avenue is to:

1. Describe a ‘shutdown utility function’ whose attainable optimum is “Suspend safely to disk (and possibly safely abort plans in progress), without building a hidden environmental copy of yourself to carry out your previous goal and then assimilate all matter in the universe to make absolutely sure the original AI stays shut down”.
2. Find a sensible way to compose the shutdown utility function with the agent’s regular utility function, such that:
- (2a) Which utility function the agent optimizes depends on whether a switch was pressed.
- (2b) The AI experiences no incentive to cause the switch to be pressed or prevent the switch from being pressed.
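
As a minimal sketch of why (2b) is the hard part, the toy Python model below composes the two utility functions in the most naive way that satisfies (2a): score an outcome by the shutdown utility function if the switch was pressed, and by the regular utility function otherwise. Everything in it is a hypothetical illustration rather than a proposed solution: the names `Outcome`, `composed_utility`, `build_actions` and `best_action`, the three available actions, the 50% chance that the operators press the switch, and the payoff numbers are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    pressed: bool          # did the switch end up pressed?
    normal_value: float    # score under the regular utility function
    shutdown_value: float  # score under the shutdown utility function


def composed_utility(o: Outcome) -> float:
    """Naive composition satisfying (2a): the switch selects the utility function."""
    return o.shutdown_value if o.pressed else o.normal_value


def build_actions(normal_optimum: float, shutdown_optimum: float, p_press: float = 0.5):
    """Hypothetical toy world: each action maps to a list of (probability, outcome)."""
    not_pressed = Outcome(False, normal_optimum, 0.0)
    pressed = Outcome(True, 0.0, shutdown_optimum)
    return {
        # Operators press the switch with probability p_press.
        "leave_switch_alone": [(1.0 - p_press, not_pressed), (p_press, pressed)],
        # The agent makes sure the switch can never be pressed.
        "disable_switch": [(1.0, not_pressed)],
        # The agent causes the switch to be pressed.
        "press_switch_itself": [(1.0, pressed)],
    }


def expected_utility(lottery) -> float:
    return sum(p * composed_utility(o) for p, o in lottery)


def best_action(normal_optimum: float, shutdown_optimum: float) -> str:
    actions = build_actions(normal_optimum, shutdown_optimum)
    return max(actions, key=lambda name: expected_utility(actions[name]))


if __name__ == "__main__":
    print(best_action(normal_optimum=10.0, shutdown_optimum=1.0))
    print(best_action(normal_optimum=1.0, shutdown_optimum=10.0))
```

In this toy model, when the regular optimum is worth more than the shutdown optimum the agent prefers to disable the switch, and when it is worth less the agent prefers to press the switch itself. Either way the naive composition meets (2a) but violates (2b), which is why the composition step needs something more careful than a simple case split.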

Harder versions of this problem would add specifications such as:

- The AI positively wants the switch to exist and will restore the switch’s existence if it is damaged.
- The AI will safely abort any plans in progress (and design plans such that they can be safely aborted).