Abortable plans

“Abortable” plans are plans that can readily be switched, partway through execution, to having very low net impact on the world. Suppose an AI is told to paint a car pink, and it starts by constructing replicating nanomachines that will do the painting. If the AI has successfully been given a shutdown utility function and a shutdown button, we can press the button to make the AI switch off, or suspend itself to disk, and take no further action; but this might not affect the nanomachines it has already made. An AI with an abort button will instead have constructed the nanomachines so that, whenever it is given the “abort” instruction, a minimum of further action on its part causes all the nanomachines (including any replicated ones) to quietly self-destruct. That is, the AI has planned in advance such that the partial execution of the original plan, plus the activation of the abort subplan midway, will together have minimum impact on the world.
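The difference between shutdown and abort can be illustrated with a toy sketch. Everything in it is hypothetical and invented for illustration: the names `Nanomachine`, `AbortablePaintPlan`, `step`, and `abort`, and the assumption that every replica is registered at creation so that a single abort signal can reach it. It models only the structural idea that the abort path is built into the plan from the start; it does not address how an AI would come to prefer such plans, and the paint already applied is left in place as a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Nanomachine:
    """Hypothetical artifact that is built with a self-destruct hook."""
    active: bool = True

    def replicate(self, registry):
        # Assumption: each copy inherits the hook and is registered at
        # creation time, so an abort can always reach it later.
        child = Nanomachine()
        registry.append(child)
        return child

    def self_destruct(self):
        self.active = False


@dataclass
class AbortablePaintPlan:
    """Toy plan whose abort subplan is designed in from the beginning."""
    registry: list = field(default_factory=list)
    painted_fraction: float = 0.0
    aborted: bool = False

    def step(self):
        """Carry out one step of the original plan, unless aborted."""
        if self.aborted:
            return
        if not self.registry:
            self.registry.append(Nanomachine())
        else:
            # Iterate over a copy because replication grows the registry.
            for machine in list(self.registry):
                if machine.active:
                    machine.replicate(self.registry)
        self.painted_fraction = min(1.0, self.painted_fraction + 0.25)

    def abort(self):
        """One cheap action that neutralizes every machine made so far."""
        self.aborted = True
        for machine in self.registry:
            machine.self_destruct()

    def active_machines(self):
        return sum(machine.active for machine in self.registry)


plan = AbortablePaintPlan()
for _ in range(3):
    plan.step()
print(plan.active_machines())  # 4 machines are running midway through

plan.abort()
plan.step()                    # further steps are no-ops after an abort
print(plan.active_machines())  # 0
```

A plain shutdown would correspond to setting `aborted = True` without calling `self_destruct` on anything: the agent stops acting, but the four machines stay active. The abortable version pays an up-front design cost (the registry and the hook) so that the abort itself requires very little further action.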

Parents:

  • Low impact

    The open problem of having an AI carry out tasks in ways that cause minimum side effects and change as little of the rest of the universe as possible.