Known-algorithm non-self-improving agent

“Known-algorithm non-self-improving” (KANSI) is a strategic scenario and class of possibly-attainable AI designs in which the first pivotally powerful AI is constructed out of known, human-understood algorithms and does not engage in extensive self-modification. Its advantage might instead come from, e.g., being run on a very large cluster of computers. Suppose an AI could crack protein folding and build nanotechnology by running correctly structured algorithms akin to deep learning on Google’s or Amazon’s computing cluster; suppose further that the builders were sufficiently paranoid/sensible to have people continuously monitoring the AI’s processes and every problem it was trying to solve, and to keep the AI from engaging in self-modification or self-improvement. This would fall into the KANSI class of scenarios, and it would imply that huge classes of problems in reflective stability, ontology identification, limiting potentially dangerous capabilities, etcetera, become much simpler.

Restricting ‘good’ or approved AI development to KANSI designs would mean deliberately forgoing whatever capability gains might be possible through self-improvement. It’s not known whether a KANSI AI could be first to reach some pivotal level of capability; this depends on unknown background variables about how much capability can be gained, and at what stage, through self-modification. Depending on those variables, making a KANSI AI first past a capability threshold might or might not be achievable by any reasonable level of effort and coordination. This is one reason among several why MIRI does not, e.g., restrict its attention to KANSI designs.

Merely intending to build a non-self-improving AI out of known algorithms is insufficient to ensure KANSI as a property; guaranteeing it might require further solutions along the lines of Corrigibility. Humans, for example, can’t modify their own brain functions; but because we’re general consequentialists who don’t always think the way we want to think, we built simple innovations like calculators out of environmental objects, in a world with no built-in calculators, so that we could think about arithmetic differently than we did by default. A KANSI design with a large divergence between how it thinks and how it wants to think might behave similarly, or might require constant supervision to detect most cases of the AI starting to behave similarly, and even then some cases might slip through the cracks. Since our present study and understanding of reflective stability is very primitive, reflective stability is plausibly still worth studying even if we want to build a KANSI agent, just so that the KANSI agent is not too wildly divergent between how it thinks about X and how it would prefer to think about X if given the choice.


  • Strategic AGI typology

    What broad types of advanced AIs, corresponding to which strategic scenarios, might it be possible or wise to create?