# Transparent Newcomb's Problem

Like Newcomb’s Problem, but Box B is *also* transparent. That is:

Omega has presented you with the following dilemma:

There are two boxes before you, Box A and Box B.

You can either take both boxes (“two-box”), or take only Box B (“one-box”).

Box A is transparent and contains $1,000.

Box B is *also* transparent and contains either $1,000,000 or $0.

Omega has already put $1,000,000 into Box B *if and only if* Omega **predicts that you will one-box when faced with a visibly full Box B.**

Omega has been right in a couple of dozen games so far, but not a thousand games, and Omega *could* be wrong next time given our current knowledge. We may alternatively suppose that Omega is right 99%, but not 99.9%, of the time. (Note: That is, Omega’s success rate reflects that everyone who’s seen a full Box B has one-boxed. Some people who’ve seen an empty Box B have been indignant about that. But based on Omega’s accuracy in the testable cases, they’re probably wrong about what they would have done.)

This Newcomblike dilemma is structurally similar to Parfit’s Hitchhiker (no decision theory disputes this structural similarity, so far as we know).

Note that it is not, in general, possible to have a transparent Newcomb’s Problem in which, for every possible agent, Omega fills Box B iff Omega predicts unconditionally that the agent ends up one-boxing. Some agent could two-box on seeing a full Box B and one-box on seeing an empty Box B, making the general rule impossible for Omega to fulfill.
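The impossibility can be checked by brute force. The following Python sketch is illustrative only: the function names and the string labels for actions are assumptions introduced here, not part of the problem statement. It enumerates both possible states of Box B and keeps those that satisfy the unconditional rule "full iff the agent actually ends up one-boxing".

```python
from typing import Callable, List

# An agent maps its observation ("is Box B full?") to an action label.
Agent = Callable[[bool], str]


def contrarian(box_b_full: bool) -> str:
    """Two-boxes on seeing a full Box B, one-boxes on seeing an empty one."""
    return "two-box" if box_b_full else "one-box"


def always_one_box(box_b_full: bool) -> str:
    """One-boxes whatever it sees."""
    return "one-box"


def consistent_fill_states(agent: Agent) -> List[bool]:
    """States of Box B satisfying the unconditional rule:
    Box B is full iff the agent, seeing that state, ends up one-boxing."""
    return [full for full in (True, False)
            if full == (agent(full) == "one-box")]


def fills_under_conditional_rule(agent: Agent) -> bool:
    """The rule this problem actually uses: fill Box B iff the agent
    would one-box when faced with a visibly full Box B."""
    return agent(True) == "one-box"
```

For `contrarian`, `consistent_fill_states` returns an empty list: neither a full nor an empty Box B is consistent with the unconditional rule. The conditional rule is still well defined for that agent, and it simply leaves Box B empty.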

Similarly, the problem setup stipulates that it is not entirely impossible for Omega to get the prediction wrong next time. Otherwise, we would face a new and distracting problem of conditioning on a visible impossibility when we see a full Box B and consider two-boxing.

# Analyses

## Causal decision theory

Two-boxes, because one-boxing cannot cause Box B to be full or empty, since Omega has already predicted and departed.
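One way to sketch this reasoning in Python is as a dominance check. The names `payoff` and `cdt_choice` are hypothetical, and the sketch assumes that a CDT agent treats the contents of Box B as fixed when comparing actions. The dollar amounts come from the problem statement.

```python
BOX_A = 1_000            # visible contents of Box A
BOX_B_FULL = 1_000_000   # contents of Box B when Omega has filled it


def payoff(action: str, box_b_full: bool) -> int:
    """Dollars received for an action, given the fixed state of Box B."""
    box_b = BOX_B_FULL if box_b_full else 0
    return box_b + (BOX_A if action == "two-box" else 0)


def cdt_choice(box_b_full: bool) -> str:
    """Pick the action with the higher payoff while holding Box B fixed,
    since the action cannot causally change what is already in the box."""
    return max(("one-box", "two-box"),
               key=lambda action: payoff(action, box_b_full))
```

Whichever state of Box B the agent sees, two-boxing pays exactly $1,000 more than one-boxing, so under this assumption `cdt_choice` returns `"two-box"` for both states.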

## Evidential decision theory

Two-boxes, because one-boxing cannot be further *good news* about Box B being full, because the agent has already seen that Box B is full. The agent, upon imagining being told that it one-boxes here, imagines concluding “Omega made its first mistake!” rather than “My eyes are deceiving me and Box B is actually empty.” (Thus, EDT agents never see a full Box B to begin with.)

## Logical decision theory

One-boxes, because:

• On timeless decision theory without the updateless feature: Even after observing Box B being full, we conclude from our extended causal model that in the *counterfactual* case where our algorithm output “Take both boxes”, Box B would have *counterfactually* been empty. (Updateful TDT does not in general output the behavior corresponding to the highest score on problems in this decision class, but updateful TDT happens to get the highest score in this particular scenario.)

• On updateless decision theories: The policy of mapping the sensory input “Box B is full” onto the action “Take only one box” leads to the highest expected utility (as evaluated relative to our non-updated prior).
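The updateless evaluation can be sketched as a comparison of policies under the prior. The Python below is a minimal illustration, not a definitive formalization. It assumes Omega predicts the agent's response to a full Box B correctly with a fixed probability `accuracy`, regardless of the policy, and it uses 99% from the alternative stipulation above. The names `expected_utility` and `best_policy` are hypothetical.

```python
from fractions import Fraction
from itertools import product

BOX_A = 1_000
BOX_B_FULL = 1_000_000
ACTIONS = ("one-box", "two-box")


def payoff(action: str, box_b_full: bool) -> int:
    """Dollars received for an action, given the state of Box B."""
    box_b = BOX_B_FULL if box_b_full else 0
    return box_b + (BOX_A if action == "two-box" else 0)


def expected_utility(policy: dict, accuracy: Fraction) -> Fraction:
    """Prior expected payoff of a policy mapping observation -> action.

    Keys are the observation (True = Box B visibly full). Assumption:
    Omega's prediction of policy[True] is right with probability
    `accuracy` and flipped otherwise.
    """
    one_boxes_on_full = policy[True] == "one-box"
    total = Fraction(0)
    for omega_correct, prob in ((True, accuracy), (False, 1 - accuracy)):
        predicted_one_box = one_boxes_on_full == omega_correct
        box_b_full = predicted_one_box  # Omega fills iff it predicts one-boxing
        total += prob * payoff(policy[box_b_full], box_b_full)
    return total


def best_policy(accuracy: Fraction) -> dict:
    """Enumerate all four observation -> action maps and keep the best."""
    policies = [{True: on_full, False: on_empty}
                for on_full, on_empty in product(ACTIONS, repeat=2)]
    return max(policies, key=lambda p: expected_utility(p, accuracy))
```

Under these assumptions, `best_policy(Fraction(99, 100))` one-boxes on a full Box B and two-boxes on an empty one, with an expected payoff of $990,010. The policy that two-boxes on either observation expects $11,000.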

The Transparent Newcomb’s Problem is significant because it argues against a widespread view that EDT and CDT split the Newcomblike problems between them, with EDT being the decision theory that accepts ‘why aincha rich?’ arguments.

EDT and CDT agree on two-boxing in the Transparent Newcomb’s Problem, both saying, “Omega has chosen to penalize the rational behavior here, alas, but it is too late for me to do anything about that.”

LDT disagrees with both and one-boxes, saying “My algorithm can output whatever behavior I want.”

EDT and CDT agents exhibit the behavior pattern that corresponds to being poor; LDT agents ask, “If your principle of choice is so rational, why aincha rich?”

## Truly clever LDT and EDT agents

Truly clever agents will realize that the (transparently visible) state of Box B reflects oracular reasoning by Omega about any factor that could affect our decision whether to one-box after seeing a full Box B. The value of an advance prediction about any possible observable factor determining our decision could easily exceed a million dollars.

For example, suppose you have until the end of the day to actually decide how many boxes to take. On finding yourself in a transparent Newcomb’s Problem, you could postcommit to an obvious strategy, such as one-boxing iff the S&P 500 closes up on the day. If you see Box B is full, you can load up on margin and buy short-term call options (and then wait, and actually one-box at the end of the day iff the S&P 500 goes up).

You could also carry out the converse strategy (buy put options if you see Box B is empty), but only if you’re confident that the S&P 500’s daily movement is independent of any options you buy and that both of your possible selves converge on the same postcommitment, since what you’re learning from seeing Box B in this case is what your action *would* have been at the end of the day *if* Box B had been full.

This general strategy was observed by Eliezer Yudkowsky and Jack LaSota.
