Glossary (Value Alignment Theory)

The parent page for definitions of words that are given a special or unusual meaning inside value alignment theory.

If a word or phrase could be mistaken for ordinary English (e.g. ‘value’ or ‘utility function’), create a glossary page indicating its special meaning inside VAT, and link the first use of that word in any potentially confusing context here. Other special phrases (like Value Alignment Theory) should also be linked to their concept pages for understandability, but they do not need glossary definitions apart from their existing concept pages. An overloaded word like ‘value’, however, needs its own brief (or lengthy) page that can be quickly consulted by somebody wondering what that word is taken to mean in the context of VAT.

See also Linguistic conventions in value alignment.


  • Friendly AI

    Old terminology for an AI whose preferences have been successfully aligned with idealized human values.

  • Cognitive domain

    An allegedly compact unit of knowledge, such that ideas inside the unit interact mainly with each other and less with ideas in other domains.

  • Concept

    In the context of Artificial Intelligence, a ‘concept’ is a category, something that identifies thingies as being inside or outside the concept.


  • AI alignment

    The great civilizational problem of creating artificially intelligent computer systems such that running them is a good idea.