# Coherence theorems

*A tutorial introducing this concept exists here.*

In the context of decision theory, “coherence theorems” are theorems saying that an agent’s beliefs or behavior must be viewable as consistent in way X, or else penalty Y happens.

E.g., suppose we’re talking about an agent’s preference in pizza toppings. Let us say an agent locally prefers pizza topping A over pizza topping B if, offered a choice between a slice of A pizza and a slice of B pizza, the agent takes the A pizza.

Then suppose some agent:

- Locally prefers onion pizza over pineapple pizza.
- Locally prefers pineapple pizza over mushroom pizza.
- Locally prefers mushroom pizza over onion pizza.

Suppose also that at least the first preference is strong enough that, e.g., the agent would pay one penny to switch from pineapple pizza to onion pizza. (We can also say, e.g., that the agent would spend some small amount of time and effort to reach out and change pizza slices, even if this did not directly involve spending money.)

Then we can, e.g.:

- Start by offering the agent pineapple pizza;
- Collect one penny from the agent to switch their option to onion pizza;
- Offer the agent a free switch from onion to mushroom;
- Finally, offer them a slice of pineapple pizza instead of the mushroom pizza.

Now the agent has the same pineapple pizza slice it started with, and is strictly one penny poorer. This is a *qualitatively* dominated strategy: the agent could have pursued a better strategy that would end with the same slice of pineapple pizza plus one penny. (Or without having lost whatever other opportunity costs, including the simple expenditure of time, it sacrificed to change pizza slices.)
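The trade sequence above can be simulated in a few lines. This is only an illustrative sketch: the names `PREFERS`, `accepts_switch`, and `run_money_pump` are made up for this example, and it assumes the agent accepts any offered switch it locally prefers, paying the stated fee.

```python
# Hypothetical encoding of the agent's local preferences as (preferred, over) pairs.
PREFERS = {
    ("onion", "pineapple"),
    ("pineapple", "mushroom"),
    ("mushroom", "onion"),
}

def accepts_switch(current, offered):
    """Assumption: the agent switches whenever it locally prefers the offered slice."""
    return (offered, current) in PREFERS

def run_money_pump(start, offers, wealth_cents):
    """Apply a sequence of (offered_topping, fee_in_cents) offers to the agent."""
    holding = start
    for offered, fee in offers:
        if accepts_switch(holding, offered):
            holding = offered
            wealth_cents -= fee
    return holding, wealth_cents

# One penny for the first switch, then two free switches.
offers = [("onion", 1), ("mushroom", 0), ("pineapple", 0)]
final_slice, final_wealth = run_money_pump("pineapple", offers, wealth_cents=100)
print(final_slice, final_wealth)  # pineapple 99
```

Repeating the same three offers would drain another penny on every pass, which is why this pattern is often called a money pump.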

Or as Steve Omohundro put it: If you prefer being in San Francisco to being in San Jose, prefer being in Oakland to being in San Francisco, and prefer being in San Jose to being in Oakland, you’re going to waste a lot of money on taxi rides.

This in turn suggests that we might be able to prove some kind of theorem saying, “If we can’t view the agent’s behavior as being coherent with some *consistent global preference ordering,* the agent must be executing dominated strategies.”
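For a finite set of options with strict preferences, "can be viewed as coherent with a consistent global ordering" amounts to the preference graph having no cycles. A minimal sketch of that check, using the standard-library `graphlib` module (Python 3.9+); the function name `global_ordering` and the example data are illustrative only, and the real theorems cover much more general settings:

```python
from graphlib import TopologicalSorter, CycleError

def global_ordering(prefers):
    """Return options from most to least preferred, or None if the strict
    local preferences contain a cycle and so fit no single ordering."""
    sorter = TopologicalSorter()
    for better, worse in prefers:
        # Register 'better' as a predecessor so it sorts before 'worse'.
        sorter.add(worse, better)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

cyclic = [("onion", "pineapple"), ("pineapple", "mushroom"), ("mushroom", "onion")]
acyclic = [("onion", "pineapple"), ("pineapple", "mushroom"), ("onion", "mushroom")]

print(global_ordering(cyclic))   # None
print(global_ordering(acyclic))  # ['onion', 'pineapple', 'mushroom']
```

The cyclic case is exactly the pizza agent that can be charged a penny per lap; the acyclic case admits no such loop of individually acceptable trades.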

Broadly speaking, this is the sort of thing that coherence theorems say, although nailing down caveats and generalizing to continuous spaces, etc., often makes the standard proofs a lot more complicated than the above argument suggests.

Another class of coherence theorems says, “If some aspect of your decisions or beliefs has coherence properties X, we can map it onto mathematical structure Y.” For example, other coherence theorems show that we can go from alternative representations of belief and credence, like log odds, to the standard form of probabilities, given assumptions like “If the agent sees piece of evidence A and then piece of evidence B, the agent’s final belief state is the same as if it sees evidence B and then evidence A.”
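As a small numerical illustration of the log-odds example, the sketch below updates a belief held in log odds on two pieces of evidence in both orders, then maps the result onto a standard probability. The prior and likelihood ratios are made-up numbers, and the sketch assumes the two pieces of evidence are conditionally independent given the hypothesis, so that each update just adds a log likelihood ratio.

```python
import math

def update_log_odds(log_odds, likelihood_ratio):
    """Bayesian update in log-odds form: add the log of the likelihood ratio."""
    return log_odds + math.log(likelihood_ratio)

def log_odds_to_probability(log_odds):
    """Map log odds onto a standard probability via the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))

prior = math.log(1 / 3)      # hypothetical prior odds of 1:3
ratio_a, ratio_b = 4.0, 0.5  # hypothetical likelihood ratios for evidence A and B

a_then_b = update_log_odds(update_log_odds(prior, ratio_a), ratio_b)
b_then_a = update_log_odds(update_log_odds(prior, ratio_b), ratio_a)

print(math.isclose(a_then_b, b_then_a))   # True: the order does not matter
print(log_odds_to_probability(a_then_b))  # about 0.4, i.e. odds of 2:3
```

This only shows one direction: a representation that happens to be log odds behaves coherently and converts to probabilities. The theorems argue the converse, that any representation with such coherence properties can be mapped onto probabilities.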

Coherence theorems generally point at consistent utility functions, consistent probability assignments, local decisions consistent with expected utility, or belief updates consistent with Bayes’s Rule.

Since relaxing the assumptions used in a coherence theorem is an improvement on that theorem (and hence good for a publication), the total family of coherence theorems is rather large and very technical.

Coherence theorems are relevant because, e.g.:

- If we are trying to figure out an optimal strategy for some problem, we’re justified in saying that *any* optimal strategy ought to let us say how much we like different possible outcomes and what we believe about our chances of getting there.
- If we’re dealing with a very advanced AI, and whatever process was responsible for the AI getting that cognitively powerful in the first place has ironed out all the shooting-off-your-own-foot, running-in-circles behaviors visible to us humans, then *so far as we can tell*, the AI will probably *look to us* like it is behaving in a way coherent with it having a consistent utility function and probabilistic beliefs.

# Extremely incomplete list of some coherence theorems in decision theory

(somebody fill this out more, please)

Wald’s complete class theorem: Given a set of possible worlds, a quantitative utility function on outcomes, and an agent receiving observations that rule out subsets of those possible worlds, every non-dominated strategy for taking different actions conditional on observations can be viewed as the agent starting with a consistent prior on the set of possible worlds and executing Bayesian updates.

Von Neumann-Morgenstern utility theorem: If an agent’s choice function over uncertain states of the world is complete, transitive, continuous in probabilities, and doesn’t change when we add to every gamble the same probability of some alternative outcome, that agent’s choices are consistent with taking the expectation of some utility function.

Cox’s Theorem (and variants with weaker assumptions): If updating on evidence A, then evidence B, leads to the same belief state as updating on evidence B, then evidence A, plus some other stuff, we can map your belief states onto classical probabilities. (E.g., if you happen to represent all your beliefs as odds, but your beliefs still obey coherence properties like believing the same thing regardless of the order in which you viewed the evidence, there is a variant of Cox’s Theorem which will construct a mapping from your odds to classical probabilities.)

Dutch book arguments: If the odds at which you accept or reject bets aren’t consistent with standard probabilities, you will accept combinations of bets that lead to certain losses or reject combinations of bets that lead to certain gains.
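A toy instance of a Dutch book, as a sketch under simplifying assumptions: the events are mutually exclusive and exhaustive, and the agent will buy a ticket paying one unit on an event at a price equal to its stated credence in that event. The function name `net_gains` and the rain example are invented for illustration.

```python
from fractions import Fraction

def net_gains(credences, stake=Fraction(1)):
    """The agent buys one ticket per event, each paying `stake` if its event
    occurs, priced at credence * stake. Returns the agent's net gain under
    each possible outcome."""
    total_price = sum(p * stake for p in credences.values())
    # Exactly one ticket pays out, whichever event occurs.
    return {outcome: stake - total_price for outcome in credences}

incoherent = {"rain": Fraction(6, 10), "no rain": Fraction(6, 10)}  # sums to 6/5
coherent = {"rain": Fraction(6, 10), "no rain": Fraction(4, 10)}    # sums to 1

print(net_gains(incoherent))  # a loss of 1/5 whether or not it rains
print(net_gains(coherent))    # zero net gain in either outcome
```

If the credences summed to less than 1 instead, a bookie could run the same trick in reverse by buying the tickets from the agent rather than selling them.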
