# Probability

*Probabilities* are the central subject of probability theory. \(\mathbb{P}(X)\) denotes our level of belief, or someone’s level of belief, that the proposition \(X\) is true. In the classical and canonical representation of probability, 0 expresses absolute incredulity, and 1 expresses absolute credulity.

For the standard probability axioms, see https://en.wikipedia.org/wiki/Probability_axioms.

# Notation

\(\mathbb{P}(X)\) is the probability that X is true.

\(\mathbb{P}(\neg X) = 1 - \mathbb{P}(X)\) is the probability that X is false.

\(\mathbb{P}(X \wedge Y)\) is the probability that both X and Y are true.

\(\mathbb{P}(X \vee Y)\) is the probability that X or Y or both are true.

\(\mathbb{P}(X|Y) := \frac{\mathbb{P}(X \wedge Y)}{\mathbb{P}(Y)}\) is the **conditional probability of X given Y.** That is, \(\mathbb{P}(X|Y)\) is **the degree to which we would believe X, assuming Y to be true.** \(\mathbb{P}(\text{yellow}|\text{banana})\) expresses “The probability that a banana is yellow.” \(\mathbb{P}(\text{banana}|\text{yellow})\) expresses “The probability that a yellow thing is a banana.”
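As a rough illustration of the notation above, the following sketch computes \(\mathbb{P}(\text{yellow}|\text{banana})\) and \(\mathbb{P}(\text{banana}|\text{yellow})\) from a small joint distribution. The table of fruits and colours, its numbers, and the helper names `prob` and `cond_prob` are all invented for this example; only the defining ratio comes from the text.

```python
from fractions import Fraction

# Hypothetical joint distribution over (fruit, colour) pairs.
# The numbers are made up for illustration and sum to 1.
joint = {
    ("banana", "yellow"): Fraction(3, 20),
    ("banana", "green"): Fraction(1, 20),
    ("lemon", "yellow"): Fraction(4, 20),
    ("apple", "green"): Fraction(5, 20),
    ("apple", "red"): Fraction(7, 20),
}

def prob(event):
    """P(event): total probability of the outcomes where `event` holds."""
    return sum(p for outcome, p in joint.items() if event(outcome))

def cond_prob(x, y):
    """P(X|Y) = P(X and Y) / P(Y); undefined when P(Y) is zero."""
    p_y = prob(y)
    if p_y == 0:
        raise ValueError("P(Y) is zero, so P(X|Y) is undefined")
    return prob(lambda o: x(o) and y(o)) / p_y

def is_banana(outcome):
    return outcome[0] == "banana"

def is_yellow(outcome):
    return outcome[1] == "yellow"

p_yellow_given_banana = cond_prob(is_yellow, is_banana)  # 3/4
p_banana_given_yellow = cond_prob(is_banana, is_yellow)  # 3/7
```

Under these made-up numbers most bananas are yellow, but fewer than half of yellow things are bananas, which is why the order of the two arguments matters.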

# Centrality of the classical representation

While there are other ways of expressing quantitative degrees of belief, such as odds ratios, there are several especially useful properties or roles of classical probabilities that give them a central / convergent / canonical status among possible ways of representing credence.

Odds ratios are isomorphic to probabilities: we can readily go back and forth between a probability of 20% and odds of 1:4. But unlike odds ratios, probabilities have the further appealing property that we can add the probabilities of two mutually exclusive possibilities to arrive at the probability that one of them occurs. The \(\frac{1}{6}\) probability of a six-sided die turning up 1, plus the \(\frac{1}{6}\) probability of the die turning up 2, equals the \(\frac{1}{3}\) probability that the die turns up 1 or 2. The odds ratios 1:5, 1:5, and 1:2 don’t have this direct relation (though we could convert to probabilities, add, and then convert back to odds ratios).
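The back-and-forth conversion described above can be sketched in a few lines. The function names `odds_to_prob` and `prob_to_odds` are illustrative choices, and odds are assumed to be given as a pair of non-negative integers (in favor, against).

```python
from fractions import Fraction

def odds_to_prob(in_favor, against):
    """Convert odds of in_favor:against into a probability."""
    return Fraction(in_favor, in_favor + against)

def prob_to_odds(p):
    """Convert a probability into odds (in_favor, against) in lowest terms."""
    p = Fraction(p)
    # Fraction stores numerator and denominator in lowest terms.
    return (p.numerator, p.denominator - p.numerator)

# A probability of 20% corresponds to odds of 1:4.
odds_of_twenty_percent = prob_to_odds(Fraction(1, 5))  # (1, 4)

# The die example: probabilities of mutually exclusive outcomes add directly.
p_one = odds_to_prob(1, 5)            # 1/6
p_two = odds_to_prob(1, 5)            # 1/6
p_one_or_two = p_one + p_two          # 1/3
odds_one_or_two = prob_to_odds(p_one_or_two)  # (1, 2)
```

Note that the addition happens on the probabilities; the odds 1:5 and 1:5 only become 1:2 after the round trip through probabilities.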

Thus, classical probabilities are uniquely the quantities that must appear in the expected utilities to weigh how much we proportionally care about the uncertain consequences of our decisions. When an outcome has classical probability \(\frac{1}{3}\), we multiply the degree to which we care by a factor of \(\frac{1}{3}\), not by, e.g., the odds ratio 1:2.
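To make the weighting concrete, here is a minimal sketch of an expected-utility calculation. The function name `expected_utility` and the utility value of 30 are assumptions made for the example, not anything specified in the text.

```python
from fractions import Fraction

def expected_utility(lottery):
    """Sum of probability * utility over (probability, utility) pairs.

    The probabilities are assumed to describe mutually exclusive,
    exhaustive outcomes, so they must sum to 1.
    """
    if sum(p for p, _ in lottery) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in lottery)

# Hypothetical gamble: an outcome worth 30 units of utility with
# probability 1/3, and nothing otherwise.
gamble = [(Fraction(1, 3), 30), (Fraction(2, 3), 0)]
eu = expected_utility(gamble)  # 10

# Weighting by the odds ratio 1:2 (i.e. 1/2) instead gives a different number.
odds_weighted = Fraction(1, 2) * 30  # 15
```

The outcome is weighted by \(\frac{1}{3}\), giving 10; using the odds ratio as the weight would give 15, which is not the quantity the expected-utility formula calls for.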

If the amount you’d pay for a lottery ticket that paid out on 1 or 2 was more or less than twice the price you paid for a lottery ticket that only paid out on 1, or a lottery ticket that paid out on 2, then I could buy from you and sell to you a combination of lottery tickets such that you would end up with a certain loss. This is an example of a Dutch book argument, which is one kind of coherence theorem that underpins classical probability and its role in choice. (If we were dealing with actual betting and gambling, you might reply that you’d just refuse to bet on disadvantageous combinations; but in the much larger gamble that is life, “doing nothing” is just one more choice with an uncertain, probabilistic payoff.)
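The lottery-ticket argument can be checked numerically. The sketch below assumes each ticket pays 1 unit if it wins and that you are willing to both buy and sell at your stated prices; the function name `agent_net` and the example prices are hypothetical.

```python
from fractions import Fraction

def agent_net(price_1, price_2, price_1or2, roll):
    """Your net gain for a given die roll after the bookie trades against you.

    Each ticket pays 1 unit if it wins. The bookie sells you whichever side
    you overprice and buys from you whichever side you underprice.
    """
    wins_1 = 1 if roll == 1 else 0
    wins_2 = 1 if roll == 2 else 0
    wins_1or2 = 1 if roll in (1, 2) else 0
    if price_1or2 > price_1 + price_2:
        # You buy the combined ticket and sell the two single tickets.
        cash = price_1 + price_2 - price_1or2
        payout = wins_1or2 - wins_1 - wins_2
    else:
        # You sell the combined ticket and buy the two single tickets.
        cash = price_1or2 - price_1 - price_2
        payout = wins_1 + wins_2 - wins_1or2
    return cash + payout

# Hypothetical incoherent prices: 1/6 for each single ticket,
# but 1/2 for the ticket that pays out on 1 or 2.
nets = [agent_net(Fraction(1, 6), Fraction(1, 6), Fraction(1, 2), roll)
        for roll in range(1, 7)]
```

With these prices every entry of `nets` is \(-\frac{1}{6}\): the payouts cancel on every roll, so the price gap is lost for certain. Coherent prices, where the combined ticket costs exactly the sum of the two single tickets, give a net of zero.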

The combination of several such coherence theorems, most notably including the Dutch book arguments, Cox’s Theorem and its variations for probability theory, and the von Neumann-Morgenstern theorem (VNM) and its variations for expected utility, together give the classical probabilities between 0 and 1 a *central* status in the theory of epistemic and instrumental rationality. Other ways of representing scalar probabilities, or alternatives to scalar probability, would need to be converted or munged back into classical probabilities in order to animate agents making coherent choices.

This also suggests that bounded agents which *approximate* coherence, or at least manage to avoid blatantly self-destructive violations of coherence, might have internal mental states which can be *approximately* viewed as corresponding to classical probabilities. Perhaps not in terms of such agents necessarily containing floating-point numbers that directly represent those probabilities internally, but at least in terms of our being able to look over the agent’s behavior and deduce that they were “behaving as if” they had assigned some coherent classical probability.

Children:

- Joint probability
The notation for writing the chance that both X and Y are true.

- Conditional probability
The notation for writing “The probability that someone has green eyes, if we know that they have red hair.”

- Interpretations of "probability"
What does it *mean* to say that a fair coin has a 50% probability of coming up heads?

- Report likelihoods, not p-values

Parents:

- Probability theory
The logic of science; coherence relations on quantitative degrees of belief.