Nonperson predicate

A “nonperson predicate” is a possible method for preventing an advanced AI from accidentally running sapient computations (it would be a potentially huge moral catastrophe if an AI created, ran, and discarded a large number of sapient programs inside itself). A nonperson predicate looks at potential computations and returns one of two possible answers, “Don’t know” and “Definitely not a person”. A successful nonperson predicate may (very often) return “Don’t know” for computations that aren’t in fact people, but it never returns “Definitely not a person” for something that is a person. In other words, to solve this problem we don’t need to know what consciousness is so much as we need to know what it isn’t: we don’t need to be sure what is a person, we need to be sure what isn’t a person.

For a nonperson predicate to be useful, however, it must still pass enough useful computations that we can build a working, capable AI out of them. (Otherwise “Rocks are okay, everything else might be a person” would be an adequate nonperson predicate.) The foreseeable difficulty of a nonperson predicate is that instrumental pressures to model humans accurately might tend to seek out flaws and loopholes in any attempted predicate. See the page on Mindcrime for more detail.
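To make the one-sided contract concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration, not a proposed implementation: the `state_transition_count` representation and the `PERSONHOOD_LOWER_BOUND` threshold stand in for whatever formally justified simplicity criterion a real predicate would need. The point is only the asymmetry of the return values: false “Don’t know”s are allowed, false “Definitely not a person”s are not.

```python
from enum import Enum, auto

class Verdict(Enum):
    DONT_KNOW = auto()
    DEFINITELY_NOT_A_PERSON = auto()

# Hypothetical assumption for illustration: no person-instantiating
# computation fits in fewer than this many primitive state transitions.
PERSONHOOD_LOWER_BOUND = 10**6

def nonperson_predicate(state_transition_count: int) -> Verdict:
    """Sound but incomplete: may answer DONT_KNOW about computations
    that aren't in fact people, but must never answer
    DEFINITELY_NOT_A_PERSON about something that is a person."""
    if state_transition_count < PERSONHOOD_LOWER_BOUND:
        return Verdict.DEFINITELY_NOT_A_PERSON
    return Verdict.DONT_KNOW

# A tiny computation is cleared; anything large stays uncertain.
assert nonperson_predicate(10) is Verdict.DEFINITELY_NOT_A_PERSON
assert nonperson_predicate(10**9) is Verdict.DONT_KNOW
```

Note the design choice this sketch encodes: usefulness depends entirely on how much territory the whitelist covers. A threshold of zero would reduce it to the “rocks are okay” predicate above, which is safe but passes nothing an AI could be built from.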

Parents:

  • Mindcrime

    Might a machine intelligence contain vast numbers of unhappy conscious subprocesses?