Researchers in value alignment theory

This page lists researchers in AI alignment.

  • Eliezer Yudkowsky (founder, MIRI)

  • Nick Bostrom (founder, FHI)

  • Benja Fallenstein (MIRI; parametric polymorphism, the Procrastination Paradox, and numerous other developments in Vingean reflection)

  • Patrick LaVictoire (MIRI; modal agents)

  • Stuart Armstrong (FHI; utility indifference)

  • Paul Christiano (UC Berkeley; approval-directed agents; previously proposed a formalization of indirect normativity)

  • Stuart Russell (UC Berkeley; author of Artificial Intelligence: A Modern Approach; previously published on theories of reflective optimality; currently interested in inverse reinforcement learning)

  • Jessica Taylor (MIRI; reflective oracles)

  • Andrew Critch (MIRI)

  • Scott Garrabrant (MIRI; logical probabilities)

  • Nate Soares (previously MIRI researcher, now Executive Director at MIRI)


  • Nick Bostrom

    Nick Bostrom, secretly the inventor of Friendly AI


  • AI alignment

    The great civilizational problem of creating artificially intelligent computer systems such that running them is a good idea.