Nonperson predicate

A “nonperson predicate” is a possible method for preventing an advanced AI from accidentally running sapient computations (it would be a potentially huge moral catastrophe if an AI created, ran, and discarded a large number of sapient programs inside itself). A nonperson predicate looks at potential computations and returns one of two possible answers, “Don’t know” and “Definitely not a person”. A successful nonperson predicate may (very often) return “Don’t know” for computations that aren’t in fact people, but it never returns “Definitely not a person” for something that is a person. In other words, to solve this problem we don’t need to know what consciousness is so much as we need to know what it isn’t: we don’t need to be sure what is a person, we need to be sure what isn’t a person.

For a nonperson predicate to be useful, however, it must still pass enough useful computations that we can build a working, capable AI out of them. (Otherwise “Rocks are okay, everything else might be a person” would be an adequate nonperson predicate.) The foreseeable difficulty of a nonperson predicate is that instrumental pressures to model humans accurately might tend to seek out flaws and loopholes in any attempted predicate. See the page on Mindcrime for more detail.
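To make the one-sided contract concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration, not a proposed implementation: the `state_transition_count` representation and the `PERSONHOOD_LOWER_BOUND` threshold stand in for whatever formally justified simplicity criterion a real predicate would need. The point is only the asymmetry of the return values: false “Don’t know”s are allowed, false “Definitely not a person”s are not.

```python
from enum import Enum, auto

class Verdict(Enum):
    DONT_KNOW = auto()
    DEFINITELY_NOT_A_PERSON = auto()

# Hypothetical assumption for illustration: no person-instantiating
# computation fits in fewer than this many primitive state transitions.
PERSONHOOD_LOWER_BOUND = 10**6

def nonperson_predicate(state_transition_count: int) -> Verdict:
    """Sound but incomplete: may answer DONT_KNOW about computations
    that aren't in fact people, but must never answer
    DEFINITELY_NOT_A_PERSON about something that is a person."""
    if state_transition_count < PERSONHOOD_LOWER_BOUND:
        return Verdict.DEFINITELY_NOT_A_PERSON
    return Verdict.DONT_KNOW

# A tiny computation is cleared; anything large stays uncertain.
assert nonperson_predicate(10) is Verdict.DEFINITELY_NOT_A_PERSON
assert nonperson_predicate(10**9) is Verdict.DONT_KNOW
```

Note the design choice this sketch encodes: usefulness depends entirely on how much territory the whitelist covers. A threshold of zero would reduce it to the “rocks are okay” predicate above, which is safe but passes nothing an AI could be built from.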

Parents:

  • Mindcrime

    Might a machine intelligence contain vast numbers of unhappy conscious subprocesses?