Central examples
Central examples used in Value Alignment Theory:
- instrumental convergence
- Gandhian stability & Orthogonality
- value not instrumentally convergent
- agents
- nanotech catastrophe?
- programmer manipulation?
- astronomical failure
- Smile maximizer
- Treacherous Turn
- AIXI and AIXI-tl
- Cartesian boundaries
- problem of ‘wireheading’ the reward signal
- methodology of unbounded analysis
- what we do and don’t know about AI
- agents
- Little box in a cellular automaton?
- Naturalized induction
- Nuclear Prisoner’s Dilemma
- timeless decision theory & Newcomblike problems
- problem of the blackmail-free equilibrium
- division-of-gains problem would need further-expanded matrix
- Delta-sigma agents?
- tiling agents
- power/safety tradeoff
- power/safety tradeoff
- defeater for some agency and recursion assumptions
Children:
- Happiness maximizer
- AIXI: How to build an (evil) superintelligent AI using unlimited computing power and one page of Python code.