Limited AGI

One of the reasons why a Task AGI can potentially be safer than an Autonomous AGI is that, since Task AGIs only need to carry out activities of limited scope, they may only need limited material and cognitive powers to carry out those tasks. The nonadversarial principle still applies, but takes the form of “don’t run the search” rather than “make sure the search returns the correct answer”.
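
As a rough toy sketch of that distinction (not from the source; the function and argument names here are hypothetical), the difference is between filtering what an unrestricted planner finds and never instantiating the unrestricted planner at all:

```python
def safe_by_checking_answers(generate_all_plans, is_catastrophic, score):
    # "Make sure the search returns the correct answer": the unrestricted
    # search over plans still runs, and safety rests entirely on the filter.
    acceptable = [p for p in generate_all_plans() if not is_catastrophic(p)]
    return max(acceptable, key=score)


def safe_by_limitation(task_scoped_plans, score):
    # "Don't run the search": plans outside the limited, task-relevant
    # domain are never generated in the first place.
    return max(task_scoped_plans, key=score)
```

In the second shape, safety is not carried by a judgment about the search results; the dangerous search simply does not happen.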

Obstacles

• Increasing your material and cognitive efficacy is instrumentally convergent in all sorts of contexts, so this pressure would presumably need to be averted all over the system.

• Good limitation proposals are not as easy as they look because particular domain capabilities can often be derived from more general architectures. An Artificial General Intelligence doesn’t have a handcrafted ‘thinking about cars’ module and a handcrafted ‘thinking about planes’ module, so you can’t just handcraft the two modules at different levels of ability.

E.g., many have suggested that ‘drive’ or ‘emotion’ is something that can be selectively removed from AGIs to ‘limit’ their ambitions; presumably these people are using a mental model that is not the standard expected utility agent model. To know which kinds of limitation are easy, you need a background picture of the AGI’s subprocesses good enough that you understand which kinds of system capability will naturally carve at the joints.
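
As a minimal sketch of why that mental model does not carve the standard agent model at the joints (this is a generic expected utility maximizer, not anything specified in the source):

```python
def expected_utility_agent(actions, outcomes, probability, utility):
    # Bare-bones expected utility maximizer. Note that there is no separable
    # 'drive' or 'emotion' component that could be deleted to limit ambition:
    # the pursuit of high-utility outcomes is implicit in the maximization
    # step itself.
    def expected_utility(action):
        return sum(probability(outcome, action) * utility(outcome)
                   for outcome in outcomes(action))
    return max(actions, key=expected_utility)
```

To make this agent less ambitious, you would have to change the utility function, the search, or the maximization itself, not remove a ‘drive’ module, because no such module exists.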

Related ideas

The research avenue of Mild optimization can be viewed as pursuing a kind of very general Limitation.

Behaviorism asks us to Limit the AGI’s ability to model other minds in non-whitelisted detail.

Taskishness can be seen as an Alignment/Limitation hybrid in the sense that it asks for the AI to only want or try to do a bounded amount at every level of internal organization.

Low impact can be seen as an Alignment/Limitation hybrid in the sense that a successful impact penalty would make the AI not want to implement larger-scale plans; a generic form of such a penalty is sketched below.

Limitation may be viewed as yet another subproblem of the Hard problem of corrigibility, since it seems like a type of precaution that a generic agent would desire to build into a generic imperfectly-aligned subagent.

Limitation can be seen as motivated by both the Non-adversarial principle and the Minimality principle.
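
As a hedged sketch of the penalized-objective form mentioned above for Low impact (the particular functional form and the `impact` measure are assumptions for illustration, not a concrete proposal from the source):

```python
def penalized_score(plan, expected_utility, impact, penalty_weight=10.0):
    # The agent prefers plans whose expected utility outweighs a penalty
    # proportional to how much they change the world, as measured by
    # `impact`; larger-scale plans therefore score worse.
    return expected_utility(plan) - penalty_weight * impact(plan)
```

Most of the difficulty lies in choosing an impact measure for which this penalty actually tracks “larger-scale plans” rather than something gameable.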

Parents:

  • Task-directed AGI

    An advanced AI that’s meant to pursue a series of limited-scope goals given it by the user. In Bostrom’s terminology, a Genie.