Known-algorithm non-self-improving agent

“Known-algorithm non-self-improving” (KANSI) is a strategic scenario and class of possibly-attainable AI designs, in which the first AI to reach a pivotal level of capability is constructed out of known, human-understood algorithms and does not engage in extensive self-modification. Its advantage might instead be achieved by, e.g., being run on a very large cluster of computers. For example, if you could build an AI capable of cracking protein folding and building nanotechnology by running correctly structured algorithms, akin to deep learning, on Google’s or Amazon’s computing cluster, and the builders were sufficiently paranoid/sensible to have people continuously monitoring the AI’s processes and all the problems it was trying to solve, and to keep the AI from engaging in self-modification or self-improvement, this would fall into the KANSI class of scenarios. This would imply that huge classes of problems in reflective stability, ontology identification, limiting potentially dangerous capabilities, etcetera, would be much simpler.
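As a rough illustration (a toy sketch, not an actual design from the source), the operational shape being described could look like the following Python fragment: a fixed, human-written algorithm with no pathway for rewriting itself, run only on tasks that human overseers have seen and approved. All names and structure here are assumptions invented for illustration.

```python
# Toy sketch of the KANSI operational shape described above; every name here
# is an illustrative assumption, not a real system.

from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the agent has no handle for mutating its own configuration
class KansiAgent:
    algorithm_version: str  # a fixed, human-understood algorithm (e.g. a deep-learning pipeline)

    def solve(self, task: str) -> str:
        # Placeholder for the known algorithm, run on a large computing cluster.
        return f"candidate solution to {task!r} via {self.algorithm_version}"


def supervised_run(agent: KansiAgent, tasks: list[str], approve) -> list[str]:
    """Human overseers review every task before the agent is allowed to work on it."""
    results = []
    for task in tasks:
        if not approve(task):  # continuous monitoring of what the AI is trying to solve
            continue
        results.append(agent.solve(task))
    return results


if __name__ == "__main__":
    agent = KansiAgent(algorithm_version="deep-learning-pipeline-v1")
    print(supervised_run(agent, ["protein structure prediction"], approve=lambda task: True))
```

The frozen dataclass and the approval hook are only meant to show where the two KANSI commitments, no self-modification and continuous human oversight, would sit structurally; neither is a substitute for the actual guarantees discussed below.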

Restricting ‘good’ or approved AI development to KANSI designs would mean deliberately forgoing whatever capability gains might be possible through self-improvement. It is not known whether a KANSI AI could be first to reach some pivotal level of capability; this depends on unknown background variables about how much capability can be gained, and at what stage, by self-modification. Depending on those background variables, making a KANSI AI the first to a capability threshold might or might not be something that could be accomplished by any reasonable level of effort and coordination. This is one reason among several why MIRI does not, e.g., restrict its attention to KANSI designs.

Merely intending to build a non-self-improving AI out of known algorithms is insufficient to ensure KANSI as a property; this may require further solutions along the lines of Corrigibility. For example, humans can’t modify their own brain functions, but because we are general consequentialists who don’t always think the way we want to think, we created simple innovations like calculators out of environmental objects, in a world that had no built-in calculators, so that we could think about arithmetic differently than we did by default. A KANSI design with a large divergence between how it thinks and how it wants to think might behave similarly, or require constant supervision to catch most cases of the AI starting to behave similarly, and some cases might still slip through the cracks. Since our present study and understanding of reflective stability is very primitive, such study is plausibly still worthwhile even if we want to build a KANSI agent, just to keep the KANSI agent from diverging too wildly between how it thinks about X and how it would prefer to think about X if given the choice.
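To make the calculator analogy concrete, here is a toy Python sketch under assumptions invented purely for illustration: an agent whose own arithmetic routine is fixed (it cannot edit itself) can still change how it effectively computes by building an external tool in its environment, which is exactly the kind of case that structural non-self-modification does not rule out.

```python
# Toy illustration of the calculator analogy; all classes and names here are
# made up for this sketch and do not come from the source text.

class Environment:
    def __init__(self):
        self.artifacts = {}

    def build_artifact(self, name, func):
        # The environment allows constructing new tools, even though the
        # agent's own code stays untouched.
        self.artifacts[name] = func


class FixedAgent:
    def native_add(self, a, b):
        # Deliberately crude "default way of thinking": repeated increments.
        total = a
        for _ in range(b):
            total += 1
        return total

    def add(self, env, a, b):
        # Route through the external calculator if one has been built.
        calculator = env.artifacts.get("calculator")
        return calculator(a, b) if calculator else self.native_add(a, b)


env = Environment()
agent = FixedAgent()
print(agent.add(env, 2, 3))                      # 5, computed the default way
env.build_artifact("calculator", lambda a, b: a + b)
print(agent.add(env, 2, 3))                      # 5, now computed via the external tool
```

The agent's source never changes, yet the way it thinks about arithmetic has; detecting this kind of divergence is what the constant supervision mentioned above would have to catch.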

Parents:

  • Strategic AGI typology

    What broad types of advanced AIs, corresponding to which strategic scenarios, might it be possible or wise to create?