# Parfit's Hitchhiker

You are stranded in the desert, running out of water, and soon to die. Someone drives up in a motor vehicle. The driver is a selfish, ideally game-theoretical agent, and what’s more, so are you. Furthermore, the driver is Paul Ekman, who has spent his whole life studying facial microexpressions and is extremely good at reading people’s honesty by looking at their faces.

The driver says, “Well, as an ideal selfish rational agent, I’ll convey you into town if it’s in my own interest to do so. I don’t want to bother dragging you to Small Claims Court if you don’t pay up. So I’ll just ask you this question: Can you honestly say that you’ll give me $1,000 from an ATM after we reach town?”

On some decision theories, an ideal selfish rational agent will realize that once it reaches town, it will have no further incentive to pay the driver. Thus, agents of this type answer “Yes,” whereupon the driver says “You’re lying” and drives off, leaving them to die.

Would you survive? (Okay, fine, *you’d* just keep your promise because of being honest. But would you still survive even if you were an ideal selfish agent running whatever algorithm you consider to correspond to the ideal principle of rational choice?)

## Analysis

Parfit’s Hitchhiker is noteworthy in that, unlike the alien philosopher-troll Omega running strange experiments, Parfit’s driver acts for understandable reasons. The Newcomblike aspect of the problem arises from the way that your algorithm’s output, once inside the city, determines both:

• Whether you actually pay up in the city;

• Your helpless knowledge of whether you’ll actually pay up in the city, which you can’t stop from being visible in your facial microexpressions.

We may assume that Parfit’s driver also asks you questions like “Have you really thought through what you’ll do?” and “Are you trying to think one thing now, knowing that you’ll probably think something else in the city?” and watches your facial expressions on those answers as well.

Note that quantitative changes in your probability of survival may be worth pursuing, even if you don’t think it’s certain that Paul Ekman could read your facial expressions correctly. Indeed, even a driver who is only fairly good at reading faces makes this an important Newcomblike problem, so long as you value significant probability shifts in your survival at more than $1,000.
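
The point about quantitative probability shifts can be made concrete with a toy expected-value calculation. Everything numeric below is an assumption for illustration (the dollar value placed on survival, the candidate detector accuracies); the problem statement only fixes the $1,000 fare.

```python
# Toy expected-value sketch of Parfit's Hitchhiker (assumed numbers).
# An agent whose algorithm outputs "pay" is read as honest with probability
# `accuracy`; an agent whose algorithm outputs "refuse" is mistakenly read
# as honest with probability 1 - accuracy. Rescue happens iff the driver
# believes the agent will pay. Death is valued at $0 here.

VALUE_OF_LIFE = 1_000_000  # assumed dollar value the agent places on surviving
FARE = 1_000               # payment demanded once in the city

def expected_value(disposition_pays: bool, accuracy: float) -> float:
    """Expected dollars for an agent whose algorithm is set to pay or refuse."""
    p_believed = accuracy if disposition_pays else 1 - accuracy
    payoff_if_rescued = VALUE_OF_LIFE - (FARE if disposition_pays else 0)
    return p_believed * payoff_if_rescued

# Even a merely fairly good face-reader makes the paying disposition better:
for acc in (0.6, 0.9, 0.99):
    print(acc, expected_value(True, acc), expected_value(False, acc))
```

Under these assumed valuations, the paying disposition dominates at every accuracy above chance, because the $1,000 fare is tiny next to the shift in survival probability it buys.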

## Logical decision theory

Survives.

• A timeless decision agent, even without the updateless feature, will reason, “If-counterfactually my algorithm for what to do in the city had the logical output ‘refuse to pay’, then in that counterfactual case I would have died in the desert”. The TDT agent will therefore evaluate the expected utility of refusing to pay as very low.

• An updateless decision agent computes that the optimal policy maps the sense data “I can see that I’m already in the city” to the action “Pay the driver $1,000”, and this computation does not change after the agent sees that it is in the city.
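
The updateless computation can be sketched as choosing among whole policies (maps from observations to actions) by their expected utility evaluated from the starting point, before any observation is made. The payoff numbers and the perfectly accurate driver below are assumptions for illustration, not part of the problem statement.

```python
# Minimal sketch (assumed formalization) of updateless policy selection.
# A policy maps the observation "in_city" to an action; the agent ranks
# policies from the desert and does not re-derive the choice in the city.

POLICIES = {
    "pay":    {"in_city": "pay_1000"},
    "refuse": {"in_city": "keep_money"},
}

def utility_from_start(policy_name: str) -> int:
    """Utility computed from the desert, assuming a perfect predictor-driver."""
    action_in_city = POLICIES[policy_name]["in_city"]
    if action_in_city == "pay_1000":
        return 1_000_000 - 1_000  # rescued, then pays (assumed valuations)
    return 0                       # driver foresees refusal and drives off

best_policy = max(POLICIES, key=utility_from_start)
print(best_policy)  # prints "pay"
```

Once in the city, the agent simply executes `POLICIES["pay"]["in_city"]`; seeing the city is an input to the chosen policy, not a trigger to recompute which policy is optimal.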

Parents:

• Newcomblike decision problems

Decision problems in which your choice correlates with something other than its physical consequences (say, because somebody has predicted you very well) can do weird things to some decision theories.