Central examples

Eliezer Yudkowsky, 14 Jul 2015 2:49 UTC

Central examples used in Value Alignment Theory:

Paperclip maximizer
instrumental convergence
Gandhian stability & Orthogonality
value not instrumentally convergent
agents
nanotech catastrophe?
programmer manipulation?
astronomical failure
Smile maximizer
Value identification problem
Unforeseen maximum
Edge Instantiation
Patch resistance
Treacherous Turn
- Programmer deception
Context Change problem
Complexity of value
AIXI and AIXI-tl
Cartesian boundaries
- problem of ‘wireheading’ the reward signal
methodology of unbounded analysis
what we do and don’t know about AI
agents
Little box in a cellular automaton?
Naturalized induction
Nuclear Prisoner’s Dilemma
timeless decision theory & Newcomblike problems
problem of the blackmail-free equilibrium
division-of-gains problem would need further-expanded matrix
Delta-sigma agents?
tiling agents
ZF provability Oracle
power/safety tradeoff
Notion of a pivotal achievement
Boxing problem
Behaviorist genie
power/safety tradeoff
defeater for some agency and recursion assumptions
That Alien Message
Boxing problem
Cognitive uncontainability
Diamond maximizer
Ontology identification problem

Children:

Happiness maximizer
AIXI
How to build an (evil) superintelligent AI using unlimited computing power and one page of Python code.