# Two independent events: Square visualization

$$\newcommand{\true}{\text{True}} \newcommand{\false}{\text{False}} \newcommand{\bP}{\mathbb{P}}$$

This is what independence looks like, using the square visualization of probabilities:

[Figure: square diagram of two independent events $$A$$ and $$B$$.]

We can see that the events $$A$$ and $$B$$ don’t interact; we say that $$A$$ and $$B$$ are independent. Whether we look at the whole square, or just the red part of the square where $$A$$ is true, the probability of $$B$$ stays the same. In other words, $$\bP(B \mid A) = \bP(B)$$. That’s what we mean by independence: the probability of $$B$$ doesn’t change if you condition on $$A$$.
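To make the claim $$\bP(B \mid A) = \bP(B)$$ concrete, here is a minimal sketch in Python. The joint distribution `joint` and the helper names `prob_b` and `prob_b_given` are assumptions made up for illustration; the numbers are chosen so that the two events come out independent.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B); keys are (value of A, value of B).
# The numbers are illustrative only and are not taken from the article.
joint = {
    (True, True): Fraction(6, 50),
    (True, False): Fraction(9, 50),
    (False, True): Fraction(14, 50),
    (False, False): Fraction(21, 50),
}

def prob_b(joint):
    """P(B): total area of the cells where B is true."""
    return sum(p for (a, b), p in joint.items() if b)

def prob_b_given(joint, a_value):
    """P(B | A = a_value): keep only the column where A = a_value, then renormalize."""
    column = {b: p for (a, b), p in joint.items() if a == a_value}
    return column[True] / sum(column.values())

print(prob_b(joint))               # 2/5
print(prob_b_given(joint, True))   # 2/5
print(prob_b_given(joint, False))  # 2/5
```

All three values agree, which is the numerical counterpart of looking at the whole square versus only one of its columns.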

Our square of probabilities can be generated by multiplying together the probability of $$A$$ and the probability of $$B$$:

[Figure: the square generated as the product of the probability of $$A$$ and the probability of $$B$$.]

This picture demonstrates another way to define what it means for $$A$$ and $$B$$ to be independent:

$$\bP(A, B) = \bP(A)\bP(B)\ .$$

## In terms of factoring a joint distribution

Let’s contrast independence with non-independence. Here’s a picture of two ordinary, non-independent events $$A$$ and $$B$$:

[Figure: square diagram of two non-independent events $$A$$ and $$B$$.]

(If the meaning of this picture isn’t clear, take a look at Square visualization of probabilities on two events.)

We have the red blocks for $$\bP(A)$$ and the blue blocks for $$\bP(\neg A)$$ lined up in columns. This means we’ve factored our probability distribution using $$A$$ as the first factor:

$$\bP(A,B) = \bP(A) \bP(B \mid A)\ .$$
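As a rough sketch of what factoring by $$A$$ first means numerically, the Python below splits a joint distribution into column widths $$\bP(A)$$ and within-column ratios $$\bP(B \mid A)$$, then rebuilds each cell as width times height. The distribution `joint` and the helper `factor_by_a` are hypothetical, and the numbers are picked so that the events are not independent.

```python
from fractions import Fraction

# Hypothetical joint distribution where A and B are NOT independent.
# Keys are (value of A, value of B); the numbers are illustrative only.
joint = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(2, 10),
    (False, True): Fraction(4, 10),
    (False, False): Fraction(3, 10),
}

def factor_by_a(joint):
    """Return P(A) (column widths) and P(B | A) (ratios inside each column)."""
    p_a = {a: joint[(a, True)] + joint[(a, False)] for a in (True, False)}
    # p_b_given_a[(b, a)] holds P(B = b | A = a).
    p_b_given_a = {
        (b, a): joint[(a, b)] / p_a[a]
        for a in (True, False)
        for b in (True, False)
    }
    return p_a, p_b_given_a

p_a, p_b_given_a = factor_by_a(joint)

# Rebuild every cell from the factorization P(A, B) = P(A) P(B | A).
rebuilt = {(a, b): p_a[a] * p_b_given_a[(b, a)] for (a, b) in joint}

print(p_a[True])                   # 3/10
print(p_b_given_a[(True, True)])   # 1/3
print(p_b_given_a[(True, False)])  # 4/7
print(rebuilt == joint)            # True
```

Here the dark-to-light ratio differs between the two columns (1/3 versus 4/7), which is what non-independence looks like in this factorization.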

We could just as well have factored by $$B$$ first: $$\bP(A,B) = \bP(B) \bP( A \mid B)\ .$$ Then we’d draw a picture like this:

[Figure: the same non-independent events, factored by $$B$$ first.]

Now, here again is the picture of two independent events $$A$$ and $$B$$:

[Figure: square diagram of two independent events $$A$$ and $$B$$.]

In this picture, there are red and blue lined-up columns for $$\bP(A)$$ and $$\bP(\neg A)$$, and there are also dark and light lined-up rows for $$\bP(B)$$ and $$\bP(\neg B)$$. It looks like we somehow factored our probability distribution $$\bP$$ using both $$A$$ and $$B$$ as the first factor.

In fact, this is exactly what happened: since $$A$$ and $$B$$ are independent, we have that $$\bP(B \mid A) = \bP(B)$$. So the diagram above is actually factored according to $$A$$ first: $$\bP(A,B) = \bP(A) \bP(B \mid A)$$. It’s just that $$\bP(B \mid A)= \bP(B) = \bP(B \mid \neg A)$$, since $$B$$ is independent from $$A$$. So we don’t need to have different ratios of dark to light (a.k.a. conditional probabilities of $$B$$) in the left and right columns:

[Figure: the independent square factored by $$A$$ first, with the same dark-to-light ratio in both columns.]

In this visualization, we can see what happens to the probability of $$B$$ when you condition on $$A$$ or on $$\neg A$$: it doesn’t change at all. The ratio of [the area where $$B$$ happens] to [the whole area] is the same as the ratio $$\bP(B \mid A)$$ where we only look at the area where $$A$$ happens, which is the same as the ratio $$\bP(B \mid \neg A)$$ where we only look at the area where $$\neg A$$ happens. The fact that the probability of $$B$$ doesn’t change when we condition on $$A$$ is exactly what we mean when we say that $$A$$ and $$B$$ are independent.

The square diagram above is also factored according to $$B$$ first, using $$\bP(A,B) = \bP(B) \bP(A \mid B)$$. The red / blue ratios are the same in both rows because $$\bP(A \mid B) = \bP(A) = \bP(A \mid \neg B)$$, since $$A$$ and $$B$$ are independent:

[Figure: the independent square factored by $$B$$ first, with the same red-to-blue ratio in both rows.]

We couldn’t do any of this stuff if the columns and rows didn’t both line up. (Which is good, because then we’d have proved the false statement that any two events are independent!)
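The "columns and rows both line up" condition can be checked directly. The sketch below is only an illustration: the two distributions and the helper names are assumptions, not part of the original article. It compares the dark-to-light ratio across columns and the red-to-blue ratio across rows.

```python
from fractions import Fraction

def prob_b_given_a(joint, a):
    """P(B | A = a): the dark fraction of the column where A = a."""
    return joint[(a, True)] / (joint[(a, True)] + joint[(a, False)])

def prob_a_given_b(joint, b):
    """P(A | B = b): the red fraction of the row where B = b."""
    return joint[(True, b)] / (joint[(True, b)] + joint[(False, b)])

def rows_and_columns_line_up(joint):
    """Return (same B-ratio in both columns, same A-ratio in both rows)."""
    same_b_ratio = prob_b_given_a(joint, True) == prob_b_given_a(joint, False)
    same_a_ratio = prob_a_given_b(joint, True) == prob_a_given_b(joint, False)
    return same_b_ratio, same_a_ratio

# Hypothetical independent joint: P(A) = 3/10, P(B) = 2/5.
independent = {
    (True, True): Fraction(6, 50),
    (True, False): Fraction(9, 50),
    (False, True): Fraction(14, 50),
    (False, False): Fraction(21, 50),
}

# Hypothetical non-independent joint.
dependent = {
    (True, True): Fraction(1, 10),
    (True, False): Fraction(2, 10),
    (False, True): Fraction(4, 10),
    (False, False): Fraction(3, 10),
}

print(rows_and_columns_line_up(independent))  # (True, True)
print(rows_and_columns_line_up(dependent))    # (False, False)
```

Exact fractions are used so that the equality checks are not disturbed by floating-point rounding.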

## In terms of multiplying marginal probabilities

Another way to say that $$A$$ and $$B$$ are independent variables (note: we’re using the equivalence between events and binary variables) is that for any truth values $$t_A,t_B \in \{\true, \false\},$$

$$\bP(A = t_A, B= t_B) = \bP(A = t_A)\bP(B = t_B)\ .$$

So the joint probabilities for $$A$$ and $$B$$ are computed by separately getting the probability of $$A$$ and the probability of $$B$$, and then multiplying the two probabilities together. For example, say we want to compute the probability $$\bP(A, \neg B) = \bP(A = \true, B = \false)$$. We start with the marginal probability of $$A$$ and the probability of $$\neg B$$, and then we multiply them:

[Figures: the marginal probability of $$A$$, the probability of $$\neg B$$, and their product.]

We can get all the joint probabilities this way. So we can visualize the whole joint distribution as the thing that you get when you multiply two independent probability distributions together. We just overlay the two distributions:

[Figure: the distributions of $$A$$ and $$B$$ overlaid on one square.]

To be a little more mathematically elegant, we’d use the topological product of two spaces shown earlier to draw the joint distribution as a product of the distributions of $$A$$ and $$B$$:

[Figure: the joint distribution drawn as a product of the distributions of $$A$$ and $$B$$.]
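The multiplication step can be sketched in a few lines of Python. The marginals `p_a` and `p_b` and the helper `multiply_marginals` are hypothetical, with numbers chosen only for illustration; the code assumes the two variables are independent, which is what justifies multiplying.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal distributions (illustrative numbers, not from the article).
p_a = {True: Fraction(3, 10), False: Fraction(7, 10)}
p_b = {True: Fraction(2, 5), False: Fraction(3, 5)}

def multiply_marginals(p_a, p_b):
    """Joint distribution of two independent variables: P(A = a, B = b) = P(A = a) P(B = b)."""
    return {(a, b): p_a[a] * p_b[b] for a, b in product(p_a, p_b)}

joint = multiply_marginals(p_a, p_b)

# P(A, not B) = P(A = True) * P(B = False)
print(joint[(True, False)])  # 9/50
# The four cells tile the whole square, so they sum to 1.
print(sum(joint.values()))   # 1
```

Each of the four cells is the area of a rectangle whose width comes from one marginal and whose height comes from the other, which mirrors overlaying the two distributions on one square.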
