Faithful simulation

The safe simulation problem is to start with some dynamical physical process \(D\) which would, if run long enough in some specified environment, produce some trustworthy information of great value, and to compute some adequate simulation \(S_D\) of \(D\) faster than the physical process could have run. In this context, the term "adequate" is value-laden—it means that whatever we would use \(D\) for, using \(S_D\) instead produces within epsilon of the expected value we could have gotten from using the real \(D.\) In more concrete terms, for example, we might want to tell a Task AGI "upload this human and run them as a simulation", and we don't want some tiny systematic skew in how the Task AGI models serotonin to turn the human into a psychopath, which is a bad (value-destroying) simulation fault. Perfect simulation will be out of the question; the brain is almost certainly a chaotic system, and hence we can't hope to produce exactly the same result as a biological brain. The question, then, is what kind of not-exactly-the-same result the simulation is allowed to produce.
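One hedged way to spell out that adequacy condition in symbols (these symbols are a stand-in, not notation from the original): writing \(V(X)\) for the value obtained from using a process \(X\) for whatever value-laden purpose we had in mind, an adequate simulation would satisfy \(\left| \mathbb{E}[V(S_D)] - \mathbb{E}[V(D)] \right| \leq \epsilon.\) The hard part is hidden in \(V\): specifying it fully would require specifying what we cared about in running \(D\) at all.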

As with "low impact" hopefully being lower-complexity than "low bad impact", we might hope to get an adequate simulation via some notion of faithful simulation, which rules out bumps in serotonin that turn the upload into a psychopath, while possibly also ruling out any number of other changes we wouldn't see as important; with this notion of "faithfulness" still being permissive enough to allow the simulation to take place at a level above individual quarks. On whatever computing power is available—possibly nanocomputers, if the brain was scanned via molecular nanotechnology—the upload must be runnable fast enough to make the simulation task worthwhile.

Since the main use for the notion of "faithful simulation" currently appears to be identifying a safe plan for uploading one or more humans as a pivotal act, we might also consider this problem in conjunction with the special case of wanting to avoid mindcrime. In other words, we'd like a criterion of faithful simulation which the AGI can compute without needing to observe millions of hypothetical simulated brains for ten seconds apiece, which could constitute creating millions of people and killing them ten seconds later. We'd much prefer, e.g., a criterion of faithful simulation of individual neurons and the synapses between them, up to the level of, say, two interacting cortical columns, such that we could be confident that in aggregate the faithful simulation of the neurons would correspond to the faithful simulation of whole human brains. This way the AGI would not need to think about or simulate whole brains in order to verify that an uploading procedure would produce a faithful simulation, and mindcrime could be avoided.

Note that the notion of a "functional property" of the brain—seeing the neurons as computing something important, and not wanting to disturb the computation—is still value-laden. It involves regarding the brain as a means to a computational end, and what we see as the important computational end is value-laden, given that chaos guarantees the input-output relation won't be exactly the same. The brain can equally be seen as implicitly computing, say, the parity of the number of synapse activations; it's just that we don't see this functional property as a valuable one that we want to preserve.
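As a toy illustration of this point, the following Python sketch (with made-up numbers) treats the same spike counts as computing two different "functional properties". A perturbation that is negligible for a property we plausibly value, the mean firing rate, completely flips a property we almost surely don't, the parity:

```python
# Hypothetical spike counts for a handful of simulated neurons.
spikes = [12, 7, 30, 5, 21]

# Two "functional properties" implicitly computed by the same activity:
mean_rate = sum(spikes) / len(spikes)  # a property we plausibly care about
parity = sum(spikes) % 2               # a property we almost surely don't

# A simulation fault that changes one spike count by 1: negligible for
# the mean rate, but it flips the parity outright.
perturbed = list(spikes)
perturbed[0] += 1

rate_shift = abs(sum(perturbed) / len(perturbed) - mean_rate)
parity_preserved = (sum(perturbed) % 2 == parity)

print(rate_shift)         # small (about 0.2 spikes)
print(parity_preserved)   # False: parity was not preserved
```

No simulation at a level above exact microstates can preserve every such property, so a criterion of faithfulness has to say which ones count, and that choice is where the value-ladenness enters.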

To the extent that some notion of function might be invoked in a notion of faithful simulation that permits speedups, we should hope that, rather than needing the AGI to understand the high-level functional properties of the brain and which details we thought were too important to simplify, it might be enough for it to understand a 'functional' model of individual neurons and synapses, with the resulting transform of the uploaded brain still allowing for a pivotal speedup and a knowably faithful simulation of the larger brain.

At the same time, strictly local measures of faithfulness seem problematic if they can conceal systematic larger divergences. We might think that any perturbation of a simulated neuron which has as little effect as adding one phonon is "within thermal uncertainty" and therefore unimportant, but if all of these perturbations are pointing in the same direction relative to some larger functional property, the difference might be very significant. Similarly if all simulated synapses released slightly more serotonin, rather than releasing slightly more or less serotonin in no particular systematic pattern.
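The gap between random and systematic faults can be sketched numerically. In the following Python sketch (all parameters hypothetical), \(N\) per-synapse errors of size \(\epsilon\) mostly cancel when their signs are unbiased, leaving an aggregate drift on the order of \(\epsilon\sqrt{N}\), but sum to \(\epsilon N\) when they all point the same way:

```python
import random

random.seed(0)  # deterministic, for illustration only

N = 10_000   # hypothetical number of simulated synapses
eps = 1e-3   # per-synapse error, individually "within thermal uncertainty"

# Random-sign faults: individual errors mostly cancel, so the aggregate
# drift is on the order of eps * sqrt(N).
random_drift = sum(random.choice((-eps, eps)) for _ in range(N))

# Systematic faults: every synapse releases slightly more serotonin,
# so the aggregate drift is eps * N, larger by a factor of sqrt(N).
systematic_drift = eps * N

print(abs(random_drift))   # typically on the order of eps * sqrt(N), ~0.1
print(systematic_drift)    # eps * N = 10.0
```

This is why a purely per-component error bound seems insufficient as a faithfulness criterion: it certifies each fault as negligible while saying nothing about whether the faults share a direction.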


  • Task-directed AGI

    An advanced AI that's meant to pursue a series of limited-scope goals given it by the user. In Bostrom's terminology, a Genie.