'Rationality' of voting in elections
Different decision theories give different answers about whether it is rational to vote. The answers diverge because voting is a Newcomblike decision problem: our own choices about voting correlate with the decisions of other people similar to us.
Suppose that a hundred thousand people are voting in a regional election for candidates Kang and Kodos. Most elections of this size do not end up being decided by a single vote. Let’s say the last election outcome was 50,220 votes for Kang and 50,833 votes for Kodos, which is a very close election as such things go. Is it ‘rational’ to spend an hour researching the candidates and another hour driving to the polling place, in order to vote?
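As a rough back-of-the-envelope check on how unlikely “decided by a single vote” is, here is a sketch under a deliberately simple binomial model (the independence assumption and the specific vote shares are illustrative choices, not anything established above): your vote decides the race only if the other voters split exactly evenly.

```python
import math

# Toy model: each of n other voters independently votes Kodos with
# probability p; your vote decides the race only if they split exactly
# evenly. We approximate P(Binomial(n, p) = n/2) with the normal density.

def p_pivotal(n, p):
    if n % 2:
        n -= 1  # need an even number of other voters for an exact tie
    k = n // 2
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return math.exp(-((k - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

print(p_pivotal(100_000, 0.500))  # ~2.5e-3: a dead-even electorate
print(p_pivotal(100_000, 0.503))  # ~4.2e-4: roughly the margin in the example above
```

Even in a race as close as the example, a one-vote margin is roughly a one-in-a-few-thousand event under this toy model; in elections of millions of voters it is rarer still.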
The two standard perspectives on voting in elections can be summarized as:
- Very few large elections are decided by a single vote. Therefore, the election winner if you vote is almost certainly identical to the winner if you don’t vote; 50,834 votes for Kodos or 50,221 votes for Kang would not change the outcome. So you shouldn’t expend the time and research costs involved in voting.
- Many other people similar to you are deciding whether to vote, or how to vote, based on similar considerations. Your decision probably correlates with theirs. You should weigh the costs of everyone like you voting against the consequences of everyone like you voting.
Analyses
Pretheoretical
“Yes, I can see that my personal vote only makes one vote’s worth of difference. But are you really saying that in a national election decided by three votes, nobody’s vote made any difference? And that in an election decided by one vote for Kodos, everyone who voted for Kodos, or everyone who didn’t vote for Kang, was singlehandedly responsible for the whole election? I can see telling me that I only carry a tiny fraction of the total responsibility. But if you say that in an election decided by 3 votes, everyone has literally zero responsibility (or rather, that everyone’s physical vote at the polling place, cast by secret ballot, carries no responsibility, while somebody who campaigned hard enough to swing 1 other vote, among all those irrational voters, would bear responsibility for the whole election), then where do election results even come from?”
Causal decision theory
The principle of rational choice dictates that we make our decisions as follows: Imagine the world as it existed in the moments before our choice. Imagine our choice, but nothing else, changing. Then run the standard rules and laws of physics forward, and see what our choice physically affects. This imagined counterfactual tells us what we should regard as the consequence of choosing that act.
Applying this rule to elections, we arrive at the answer that the result of changing our physical act is to cause one more vote to go to Kang or to Kodos, which is very unlikely to change the winner of the election. Our choice therefore has little or no effect, aside from the costs we expend to go to the polling place. Or maybe voting gives us a warm glow of having done our civic duty; but it doesn’t counterfactually change the election results. Counterintuitive or not, that’s what the rules say is the rational answer.
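As a minimal sketch of that verdict, with all stakes and costs invented for illustration: CDT values voting at the pivot probability times the value to you of swinging the outcome, minus your costs.

```python
# CDT counts only the causal effect of your one physical vote.
def cdt_ev_of_voting(p_pivotal, value_to_you_of_swing, cost_of_voting):
    return p_pivotal * value_to_you_of_swing - cost_of_voting

# Invented numbers: a ~4e-4 pivot chance (the close regional race above),
# a $10,000 personal stake in which candidate wins, and ~$40 of research
# and travel costs.
print(cdt_ev_of_voting(4e-4, 10_000, 40))  # -36.0: CDT says stay home
print(cdt_ev_of_voting(1e-7, 10_000, 40))  # ~-40.0: national-scale pivot odds are starker
```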
Logical decision theory
The principle of rational choice says that when you decide, you are deciding the logical output of your decision algorithm. (And also deciding all other logical facts that are sufficiently tightly correlated, although this broader class of logical consequences is harder to formalize and still under debate.)
In the broad picture, LDT sides with the perspective that “Many people similar to you are deciding whether to vote, and you are effectively deciding whether that whole cohort votes or doesn’t vote, and you should weigh the costs of the whole cohort voting against the consequences of the whole cohort voting.”
The finer details of this picture depend on open questions about how to condition on setting logical facts, when many people are running similar algorithms and no two people are running exactly the same algorithm. Since logical decision theorists are still debating exactly how to formalize the notion of a “logical consequence”, no decisive answer yet exists to these finer questions under LDT.
These open questions include:
- Maybe you should regard your decision about whether to vote at all as correlated with a cohort of people making that general decision through reasoning sufficiently similar to yours, and then see a separate decision about whom to vote for, correlated with a smaller group of people selecting candidates on a basis similar to the one you use.
- Maybe the costs or benefits of voting differ quantitatively among people who are deciding for general reasons similar to yours, so that the central policy decision you’re all making in something like unison corresponds to allowing a particular qualitative step in a decision process. (Or maybe to setting a central quantitative threshold plus individual noise, though it’s harder to see how a factor like that could reasonably be extracted from lots of humans deciding whether to vote.)
- Very few people will explicitly be taking logical decision theory into account. Perhaps, if you’ve heard of LDT and make decisions on that basis, you are part of such a tiny cohort that you should not bother voting in elections until knowledge of LDT becomes more widespread. Alternatively, since the main advice of logical decision theory is that the pretheoretical perspective is pretty close to correct, maybe you should regard your LDT knowledge only as removing CDT’s obstacle to deciding, in rough unison with everyone else, on the usual pretheoretical grounds.
Voting in an election might still end up being irrational: if you don’t expect the election to be close, or you expect the pool of people voting for reasons similar to yours to be sufficiently small, or you expect the combined pool of all such small cohorts to have little effect when added onto the election. Or if the candidates are sufficiently similar that the variance in expected consequences seems small, etcetera. (LDT does not say in a blanket way that everyone should vote, but it allows voting to be rational for large numbers of people under plausible circumstances.)
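A minimal sketch of the LDT-style comparison, with every quantity invented for illustration: your decision sets the voting policy for the whole cohort of similarly-reasoning voters, so you weigh the cohort’s total costs against the chance that its correlated votes swing the election.

```python
# LDT-style comparison: you are choosing the policy for all K voters whose
# decision algorithms correlate with yours, not just your own ballot.
def ldt_ev_of_cohort_voting(p_swing_given_cohort, value_of_swing,
                            cohort_size, cost_per_voter):
    return p_swing_given_cohort * value_of_swing - cohort_size * cost_per_voter

# Invented numbers: 2,000 correlated voters in a race this close might have
# a ~30% chance of deciding it; the better winner is worth $10M to the
# region; each cohort member pays ~$40 in time and research.
print(ldt_ev_of_cohort_voting(0.30, 10_000_000, 2_000, 40))  # 2,920,000.0 > 0
```

On these made-up numbers the cohort policy “vote” comes out far ahead, even though each individual ballot still looks nearly worthless to the CDT calculation above.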
Evidential decision theory
The principle of rational choice dictates that the expected consequence of your decision is how you would expect the world to be if you were informed, as news, that you had actually decided that way. You might vote because, if somebody told you as a fact that you did end up deciding to vote, this would be good news about how many voters similar to you would also vote.
But for the same reason that evidential decision theory says to take both boxes in the Transparent Newcomb’s Problem, EDT seems likely to estimate a much lower impact of voting than LDT does. On election day, you’ve already observed many people similar to you voting in previous elections, or seen statistics on how many people like you have voted in previous elections, or watched a friend announce their vote earlier in the day. This is akin to seeing inside the box in the Transparent Newcomb’s Problem, and it screens off the extent to which your own vote this time would be marginal good news about similar people voting.
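To make the screening-off concrete, here is a toy Bayesian model with invented probabilities: a shared disposition S drives your vote, the turnout of people like you, and the past-election statistics you may already have seen. Conditioning on the statistics leaves your own vote carrying almost no additional news about turnout.

```python
# Toy model of EDT's screening-off. S = shared disposition of "people like
# you" (high or low); it drives your vote D, the cohort turnout T (same
# conditional table as D, for simplicity), and past-election statistics E.

def p_big_turnout(you_vote, saw_statistics):
    """P(T = big | D, and optionally E) in the toy model."""
    numerator = normalizer = 0.0
    for s, p_s in (("high", 0.5), ("low", 0.5)):
        p_act = 0.9 if s == "high" else 0.2  # P(vote | S) and P(big turnout | S)
        p_d = p_act if you_vote else 1 - p_act
        # Likelihood of the (high-looking) statistics you saw, given S:
        p_e = (0.99 if s == "high" else 0.01) if saw_statistics else 1.0
        weight = p_s * p_d * p_e
        numerator += weight * p_act          # contribution of T = big
        normalizer += weight
    return numerator / normalizer

print(p_big_turnout(True,  False))  # ~0.77: unobserved, voting is big good news
print(p_big_turnout(False, False))  # ~0.28
print(p_big_turnout(True,  True))   # ~0.90: after seeing the statistics...
print(p_big_turnout(False, True))   # ~0.85: ...your vote adds almost no news
```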
Parents:
- Newcomblike decision problems
Decision problems in which your choice correlates with something other than its physical consequences (say, because somebody has predicted you very well) can do weird things to some decision theories.
Comments

Is there no standard perspective that says:

Very few elections are decided by a single vote, but those that are can matter enough that voting is worth it (especially in close areas)? A naive expected-value calculation, which sometimes comes out positive without any need for serious decision-theoretic analysis (because, from your perspective, your chance of casting the deciding vote falls off roughly as one over the size of the system you’re potentially moving, while the value of moving it grows with that size; see the sketch after this comment)?
If you’re only talking about the case where an election has a clear winner in advance, and your vote is, based on your knowledge of the system, too unlikely to tip the balance for the size of the effect to outweigh the cost to you (which is definitely not true of the current example), then I could see discarding that perspective; but it should be addressed, or the scenario set up to rule it out.
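A sketch of the naive expected-value calculation described above, with toy numbers (the ~10/N pivot-probability heuristic for close races, and the $100-per-person stake, are illustrative assumptions, not anything established in this thread):

```python
# Naive EV: the pivot chance falls like ~1/N in close races while the value
# of swinging the outcome grows with N, so the two roughly cancel.
def naive_ev(n_voters, p_pivotal, benefit_per_person, cost_to_you):
    return p_pivotal * n_voters * benefit_per_person - cost_to_you

for n in (100_000, 1_000_000, 100_000_000):
    # Rough close-race heuristic: p_pivotal ~ 10 / n (order of magnitude only)
    print(n, naive_ev(n, 10 / n, 100, 40))
# EV stays ~$960 at every scale: the size of the electorate drops out
```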
I think I once saw either Andrew Gelman or @311 make the expected-utility version of this argument (“there is an incredibly small chance that you will decide the whole election”). It could be worth including as a perspective, but it would need an accompanying discussion of the division-of-responsibility problem. Imagine the case where the election does come down to one vote, with a hundred million people all thinking they individually decided a whole national election… which, if that were actually how things worked, each of them should have been willing to spend their whole life savings to do.
Hm, do you actually need that discussion? In no case does an agent know in advance that their vote will decide the election, just that there is some (usually extraordinarily slim) chance that it will. A situation where all agents have the impossible piece of information (that the election is close enough that my actions can tip it and, importantly, that my tipping won’t be undone by others in identical positions) seems like the wrong situation to be looking at, and would unsurprisingly lead to crazy outputs. Sure, in retrospect all the agents can go “damn, I should’ve put massive effort into acquiring more votes” if the election turned out close enough that they could have tipped it with large positive expected value, but that seems like a correct and reasonable conclusion in hindsight, just not one that was foreseeable.
The EV calculation feels like a system I could actually use to weigh up the pros and cons, by looking at statistics on the closeness of various elections and estimating the value of tipping them with maybe a few tens of hours of research, whereas estimating the correlation between my voting habits and various possible reference classes of voter seems hopeless in practice (without, perhaps, enough data to reconstruct key parts of large numbers of people’s decision processes, and massive effort classifying them, at which point you’re not really running a process other people are likely to run; unless you make your results publicly available, and things get recursive!).
Maybe explaining this is more of a detour than you want, though, since it’s less interesting from a decision theory perspective?