Module 03

Conditional probability and independence


TC and DJM

Last modified — 09 Jan 2026

1 Conditional probability

Conditional probability

  • In general, the outcome of a random experiment can be any element of \(\Omega\).

  • Sometimes, we have “partial information” about which elements can occur.

Example
  • Roll a die.

  • If \(A\) is the event of obtaining a “2”, then \(\mathbb{P}(A) = 1/6\).

  • But if the outcome is known to be even, then intuition suggests that the probability of \(A\), given this information, should exceed \(1/6\).

  • Conditional probability formalizes this intuition (and helps to avoid mistakes).

Conditional probability

  • Two events play distinct roles in this example:

  • The event of interest \(A = \{ 2 \}\)

  • The conditioning event \[B= \{\text{outcome is even}\} = \{2, 4, 6\}\]

  • The conditioning event captures the “partial information”

Conditional probability, formal definition

  • Let \(A, B \subseteq \Omega\) and assume \(\mathbb{P}(B) > 0\)
Definition
  • The conditional probability of \(A\) given \(B\) is \[\mathbb{P}\left(A \ \vert\ B \right) \, = \, \frac{ \mathbb{P}\left( A \cap B \right) }{ \mathbb{P}\left( B \right) }\]
  • Just as \(\mathbb{P}(\cdot)\) is a function, for any fixed \(B\), \[\mathbb{Q}_B \left( \ \cdot \ \right) \, = \, \mathbb{P}\left(\ \cdot \ \ \vert\ B \right)\] is a function. Its argument is any event \(A \subseteq \Omega\).

  • Moreover, \(\mathbb{P}\left(\ \cdot \ \ \vert\ B \right)\) satisfies the three Axioms of a Probability.

\(\mathbb{P}(\cdot | B)\) is a probability

Exercise 1
Prove that \(\mathbb{P}(\cdot | B)\) is a probability.

Formalizing our intuition

Your friend rolls a fair die once, looks at it, and tells you that the result is even.

What is \(\mathbb{P}(\{2\} \ \vert\ \text{even})\)?

  • The event of interest \(A = \{ 2 \}\)

  • The conditioning event \(B= \{\text{outcome is even}\} = \{2, 4, 6\}\)

  • Therefore, \[ \mathbb{P}(A \ \vert\ B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} = \frac{1/6}{1/2} = 1/3 > 1/6 = \mathbb{P}(A). \]
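The same calculation can be checked by enumerating the six equally likely outcomes. This is a minimal sketch; the helper names `prob` and `cond_prob` are illustrative and not part of the lecture.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a fair die


def prob(event):
    """P(event) when all outcomes in omega are equally likely."""
    return Fraction(len(event & omega), len(omega))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / prob(b)


A = {2}        # event of interest
B = {2, 4, 6}  # conditioning event: outcome is even

print(prob(A))          # 1/6
print(cond_prob(A, B))  # 1/3
```

Using `Fraction` keeps the arithmetic exact, so the comparison with \(1/6\) does not depend on floating-point rounding.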

Multiplication property

If \(\mathbb{P}(A_1) > 0\), then \[ \mathbb{P}\left( A_{1} \cap A_{2}\right) = \mathbb{P}\left(A_{2}\ \vert\ A_{1}\right) \, \mathbb{P}\left( A_{1}\right). \]

Corollary
If \(\mathbb{P}(A_1),\ \mathbb{P}(A_1 \cap A_2),\dots,\ \mathbb{P}(A_1 \cap A_2 \cap \dots \cap A_{n-1}) > 0\), then \[\begin{aligned} \mathbb{P}\left( A_{1}\cap A_{2}\cap \cdots \cap A_{n}\right) &= \mathbb{P}\left( A_{n}\ \vert\ A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right) \\ &\quad \times \mathbb{P}\left( A_{n-1}\ \vert\ A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right) \\ & \quad \times \cdots \times\\ & \quad \times \mathbb{P}\left( A_{3}\ \vert\ A_{1}\cap A_{2}\right) \times \mathbb{P}\left( A_{2}\ \vert\ A_{1}\right) \times \mathbb{P}\left( A_{1}\right) \end{aligned}\]

Proof of multiplication property

Proof
To fix ideas, look at the case \(n=4\) \[\begin{aligned} & \mathbb{P}(A_{1}) \mathbb{P}(A_{2} \ \vert\ A_{1}) \mathbb{P}(A_{3} \ \vert\ A_{1} \cap A_{2}) \mathbb{P}(A_{4} \ \vert\ A_{1} \cap A_{2} \cap A_{3}) \\ &= \mathbb{P}(A_{1}) \frac{\mathbb{P}(A_{1} \cap A_{2})} {\mathbb{P}(A_{1})} \frac{\mathbb{P}(A_{1} \cap A_{2} \cap A_{3})} {\mathbb{P}(A_{1} \cap A_{2})} \frac{\mathbb{P}(A_{1} \cap A_{2} \cap A_{3}\cap A_{4})} {\mathbb{P}(A_{1} \cap A_{2} \cap A_{3})}\\ &=\mathbb{P}(A_{1} \cap A_{2} \cap A_{3} \cap A_{4}). \end{aligned}\]

The proof is the same for any \(n\). Everything cancels except what we want.

Urns and Balls

Exercise 2
  • An urn has 10 red balls and 40 black balls.

  • Three balls are randomly drawn without replacement.

Calculate the probability that:

  1. The first drawn ball is red, the 2nd is black and the 3rd is red.

  2. The 3rd ball is red given that the 1st is red and the 2nd is black.
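One way to check an answer to part 1 is to compare the multiplication property against a brute-force count. The sketch below assumes that every ordered draw of three distinct balls is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 10 + ["B"] * 40  # urn contents from the exercise

# Multiplication property:
# P(R1 and B2 and R3) = P(R1) * P(B2 | R1) * P(R3 | R1 and B2)
p_r1 = Fraction(10, 50)
p_b2_given_r1 = Fraction(40, 49)       # 40 black among 49 remaining
p_r3_given_r1_b2 = Fraction(9, 48)     # 9 red among 48 remaining
p_chain = p_r1 * p_b2_given_r1 * p_r3_given_r1_b2

# Brute force: count ordered draws of 3 distinct balls with colours R, B, R
total = 0
hits = 0
for i, j, k in permutations(range(len(balls)), 3):
    total += 1
    if (balls[i], balls[j], balls[k]) == ("R", "B", "R"):
        hits += 1
p_enum = Fraction(hits, total)

print(p_chain, p_enum)  # the two values should agree
```

The factor `p_r3_given_r1_b2` is exactly the quantity asked for in part 2.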

“Total probability” formula

Definition

We say that \(B_{1}, \ldots, B_{n}\) is a partition of \(\Omega\) if

  1. They are disjoint \(B_{i}\cap B_{j} \, = \, \varnothing \quad \mbox{ for } i \ne j \, ,\)

  2. They cover the whole sample space: \(\bigcup_{i=1}^{n} B_{i} \, = \, \Omega\)

Theorem
If \(B_{1}, \ldots, B_{n}\) is a partition of \(\Omega\) with \(\mathbb{P}(B_i) > 0\) for all \(i\), then, for any event \(A \subseteq \Omega\), \[\mathbb{P}\left( A\right) =\sum_{i=1}^{n} \mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right).\]

Proof of Total Probability

Proof
  • \(A = A \cap \Omega = A \cap \left( \bigcup _{i=1}^{n}B_{i}\right) = \bigcup_{i=1}^{n}\left( A \cap B_{i}\right)\)

  • The events \(\left( A \cap B_{i}\right)\) are disjoint.

  • Therefore, by Axiom 3, we have \[\begin{aligned} \mathbb{P}\left( A\right) & = \mathbb{P}\left( \bigcup_{i=1}^{n} A \cap B_{i} \right) \\ &=\sum_{i=1}^{n} \mathbb{P}\left( A\cap B_{i}\right) \\ &=\sum_{i=1}^{n} \mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right). \end{aligned}\]

Flu tests

  • Suppose that every patient who visits the ER is given a flu test.
  • Suppose that 30% of patients have flu.
  • The test is known to have 80% sensitivity: a patient with flu tests positive 80% of the time.
  • The test is known to have 80% specificity: a patient without flu tests negative 80% of the time.

Exercise 3
What percentage of patients test positive for flu?
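The total probability formula can be applied mechanically once the partition is chosen. The sketch below uses the partition {flu, no flu} and the numbers on this slide; the function name `total_probability` is an illustrative choice.

```python
from fractions import Fraction


def total_probability(likelihoods, priors):
    """P(A) = sum_i P(A | B_i) * P(B_i) for a partition B_1, ..., B_n."""
    if sum(priors) != 1:
        raise ValueError("partition probabilities must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))


# Partition: B_1 = {flu}, B_2 = {no flu}
priors = [Fraction(3, 10), Fraction(7, 10)]
# P(positive | flu) = 0.8 and P(positive | no flu) = 1 - 0.8 = 0.2
likelihoods = [Fraction(8, 10), Fraction(2, 10)]

p_positive = total_probability(likelihoods, priors)
print(p_positive, float(p_positive))
```

Note that the second likelihood is the complement of the stated rate for patients without flu, since those patients test negative 80% of the time.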

Bayes formula

Theorem
If \(B_{1}\), \(B_{2}\), …, \(B_{n}\) is a partition of \(\Omega\) with \(\mathbb{P}(B_j) > 0\) for all \(j\), and \(\mathbb{P}(A) > 0\), then for each \(i=1,\dots,n\), we have \[\mathbb{P}\left( B_{i}\ \vert\ A\right) \, = \, \frac{\mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right) }{\sum_{j=1}^{n} \mathbb{P}\left( A\ \vert\ B_{j}\right) \, \mathbb{P}\left( B_{j}\right) }.\]

Proof
\[\begin{aligned} \mathbb{P}\left( B_{i}\ \vert\ A\right) &=\frac{\mathbb{P}\left( A\cap B_{i}\right)}{\mathbb{P}\left( A\right) } & \text{(Definition of conditional prob)} \\ &=\frac{ \mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right)}{\mathbb{P}\left( A\right) } & \text{(Multiplication Rule)} \\ &=\frac{\mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right) }{\sum_{j=1}^{n} \mathbb{P}\left( A\ \vert\ B_{j}\right) \, \mathbb{P}\left( B_{j}\right) } & \text{(Rule of Total Prob)} \end{aligned} \]
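The three steps of the proof translate directly into a short computation. As a sketch, the code below reuses the numbers from the earlier "Flu tests" slide (30% of patients have flu; the test is correct 80% of the time in each group); the function name `bayes_posterior` is illustrative.

```python
from fractions import Fraction


def bayes_posterior(likelihoods, priors):
    """Return [P(B_i | A)] given P(A | B_i) and P(B_i) over a partition."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(A and B_i)
    evidence = sum(joint)                                 # P(A), total probability
    if evidence == 0:
        raise ValueError("P(A) must be positive")
    return [j / evidence for j in joint]


priors = [Fraction(3, 10), Fraction(7, 10)]       # P(flu), P(no flu)
likelihoods = [Fraction(8, 10), Fraction(2, 10)]  # P(+ | flu), P(+ | no flu)

posterior = bayes_posterior(likelihoods, priors)
print(posterior)  # [Fraction(12, 19), Fraction(7, 19)]
```

The posterior probabilities sum to 1, as they must, because \(\mathbb{P}(\cdot \ \vert\ A)\) is itself a probability.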

Flu prevalence

  • Suppose that every patient who visits the ER is given a flu test.
  • Suppose that 30% of tests are positive.
  • The test is known to have 80% sensitivity: a patient with flu tests positive 80% of the time.
  • The test is known to have 80% specificity: a patient without flu tests negative 80% of the time.
Exercise 4

Suppose you bring a friend to the ER.

  1. What is the prevalence of flu?
  2. What is the probability that your friend has flu if they test positive?
  3. What is the probability that your friend has flu if they test negative?
  4. What is the probability that a test is wrong?

The Monty Hall Problem

  • You are a contestant on a game show. In front of you are three doors.
  • Behind two doors are goats. 🐐
  • Behind one door is a car. 🚗

You select a door. The host then opens one of the 2 remaining doors, revealing a 🐐. The host asks:

Would you like to switch to the remaining closed door?

The Monty Hall Problem

Exercise 5
Show that the probability of winning the car if you switch doors is 2/3.
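A simulation does not replace the proof, but it is a useful sanity check. The sketch below assumes the usual rules: the car is placed uniformly at random, the host never opens your door or the car's door, and the host picks uniformly when two goat doors are available. The function names and the seed are illustrative.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed host rule: open a door that is neither the pick nor the car
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither the pick nor the opened one
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of wins over independent simulated rounds."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print(win_rate(switch=True))   # close to 2/3
print(win_rate(switch=False))  # close to 1/3
```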

2 Independence

Independence

Definition
We say that events \(A\) and \(B\) are independent if \[\mathbb{P}\left( A\cap B\right) \, = \, \mathbb{P}\left( A \right) \, \mathbb{P}\left( B\right).\]

Theorem
If \(A\) and \(B\) are independent and \(\mathbb{P}\left( B\right) >0\), then

\[\mathbb{P}\left( A\ \vert\ B\right) = \frac{\mathbb{P}\left( A\cap B\right) }{\mathbb{P}\left( B\right) } = \frac{\mathbb{P}\left( A\right) \mathbb{P}\left( B\right) }{\mathbb{P}\left( B\right) } = \mathbb{P}\left( A\right).\]

  • Knowing that \(B\) occurred does not change the probability of \(A\). If \(\mathbb{P}(A) > 0\), the same holds with the roles of \(A\) and \(B\) swapped.

  • Thus the name: “independent events”

Independence and trivial probabilities

Definition
We say that an event \(A\) is non-trivial if \(0<\mathbb{P}\left( A\right) <1\).

Theorem

Let \(A\) and \(B\) be non-trivial events. Then:

  1. If \(A\cap B=\varnothing\) then \(A\) and \(B\) are not independent
  2. If \(A\subset B\) then \(A\) and \(B\) are not independent.

Exercise 6
Prove the theorem.

Independence and complements

Exercise 7
  1. If \(A\) and \(B\) are independent then so are \(A^{c}\) and \(B\).
  2. If \(A\) and \(B\) are independent then so are \(A\) and \(B^{c}\).
  3. If \(A\) and \(B\) are independent then so are \(A^{c}\) and \(B^{c}\).

Independence depends on \(\mathbb{P}\)

  • Let \(\Omega =\left\{ 1, 2, 3, 4, 5, 6, 7, 8 \right\}\)
  • Let \(A=\{1, 2, 3, 4\}\) and \(B=\left\{ 4, 8\right\}\)

Case 1

If \(\mathbb{P}(\{i\}) = 1/8 \qquad \forall i\), then

\[\begin{aligned} \mathbb{P}\left( A\cap B\right) &= \mathbb{P}\left( \left\{ 4 \right\} \right) = 1/8, \qquad \text{ and } \qquad \mathbb{P}\left( A\right) \mathbb{P}\left( B\right) = 4/ 8 \times 2/8 = 1/8. \end{aligned}\]

The two quantities agree, so \(A\) and \(B\) are independent under this \(\mathbb{P}\).

Case 2

If \(\mathbb{P}( \{ i \}) = i / 36 \qquad 1 \le i \le 8\), then

\[\begin{aligned} \mathbb{P}\left( A\cap B\right) &= \mathbb{P}\left( \left\{ 4\right\} \right) =4/36, \qquad \text{ and } \qquad \mathbb{P}\left( A\right) \, \mathbb{P}\left( B\right) = 10 / 36 \times 12/36 = 5/54 \neq 4/36. \end{aligned}\]

The two quantities differ, so \(A\) and \(B\) are not independent under this \(\mathbb{P}\).
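Both cases can be checked with exact arithmetic. In this sketch a probability is stored as a dictionary from outcomes to point masses; the helper names `prob` and `is_independent` are illustrative.

```python
from fractions import Fraction

omega = range(1, 9)
A = {1, 2, 3, 4}
B = {4, 8}


def prob(pmf, event):
    """P(event) for a pmf given as {outcome: probability}."""
    return sum(pmf[w] for w in event)


def is_independent(pmf, a, b):
    """Check the defining identity P(a and b) == P(a) * P(b)."""
    return prob(pmf, a & b) == prob(pmf, a) * prob(pmf, b)


uniform = {i: Fraction(1, 8) for i in omega}    # Case 1
weighted = {i: Fraction(i, 36) for i in omega}  # Case 2

print(is_independent(uniform, A, B))   # True
print(is_independent(weighted, A, B))  # False
```

The events \(A\) and \(B\) are the same sets in both calls; only the probability changes.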

More than 2 independent events

Definition
We say that the events \(A_{1},A_{2},\dots\) are independent if \[\mathbb{P}\left( \bigcap_{i \in K} A_{i} \right) = \prod_{i \in K} \mathbb{P}(A_i),\] for every finite set of indices \(K = \{i_1,\dots,i_k\}\).

For example, three events \(A_1\), \(A_2\), and \(A_3\) are independent if and only if all of the following hold:

\[\begin{aligned} \mathbb{P}\left( A_{1}\cap A_{2}\right) &= \mathbb{P}\left( A_{1}\right) \, \mathbb{P}\left( A_{2}\right), \\ \mathbb{P}\left( A_{1}\cap A_{3}\right) &= \mathbb{P}\left( A_{1}\right) \, \mathbb{P}\left( A_{3}\right),\\ \mathbb{P}\left( A_{2}\cap A_{3}\right) &= \mathbb{P}\left( A_{2}\right) \, \mathbb{P}\left( A_{3}\right),\\ \mathbb{P}\left( A_{1}\cap A_{2}\cap A_{3}\right) &= \mathbb{P}\left( A_{1}\right) \, \mathbb{P}\left( A_{2}\right) \, \mathbb{P}\left( A_{3}\right). \end{aligned}\]

Coin flipping

We flip a fair coin twice. Define three events:

  1. \(A = \{\text{first flip is H}\}\).
  2. \(B = \{\text{second flip is H}\}\).
  3. \(C = \{\text{flips show the same result}\}\).

Exercise 8
Show that \(A,B,C\) are pairwise independent, but not independent.
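Before writing the proof, the claim can be checked by enumerating the four equally likely outcomes. This is a sketch; the names `prob`, `pairwise`, and `triple` are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

omega = set(product("HT", repeat=2))  # four equally likely outcomes


def prob(event):
    """P(event) under the uniform probability on omega."""
    return Fraction(len(event), len(omega))


A = {w for w in omega if w[0] == "H"}   # first flip is H
B = {w for w in omega if w[1] == "H"}   # second flip is H
C = {w for w in omega if w[0] == w[1]}  # flips show the same result

# Pairwise independence: the product rule for every pair of events
pairwise = all(
    prob(x & y) == prob(x) * prob(y)
    for x, y in combinations([A, B, C], 2)
)
# Full independence additionally needs the product rule for all three
triple = prob(A & B & C) == prob(A) * prob(B) * prob(C)

print(pairwise)  # True
print(triple)    # False
```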

3 Continuity of probabilities

Continuity

Theorem
Suppose events \(A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots\) and \(\bigcup_{n=1}^\infty A_n= A\) for some set \(A\). Then \(\lim_{n \to \infty} \mathbb{P}(A_n) = \mathbb{P}(A)\).

Theorem
Suppose events \(A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots\) and \(\bigcap_{n=1}^\infty A_n= A\) for some set \(A\). Then \(\lim_{n \to \infty} \mathbb{P}(A_n) = \mathbb{P}(A)\).

Exercise 9
Recall the exercise from the first lecture:

A die is rolled repeatedly until we see a 6.

Let \(Z\) be the event that you eventually stop rolling. Show that \(\mathbb{P}(Z) = 1\).
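One possible route, offered as a hint rather than the intended solution, is to let \(A_n\) be the event that a 6 appears within the first \(n\) rolls. These events are increasing and their union is \(Z\). Assuming the rolls are independent and the die is fair, \(\mathbb{P}(A_n) = 1 - (5/6)^n\), which the sketch below evaluates; the function name `p_stop_by` is illustrative.

```python
from fractions import Fraction


def p_stop_by(n):
    """P(A_n): at least one 6 in the first n independent fair rolls."""
    return 1 - Fraction(5, 6) ** n


# The probabilities increase towards 1 as n grows
for n in (1, 10, 50, 100):
    print(n, float(p_stop_by(n)))
```

The continuity theorem for increasing events is what justifies passing from these finite-\(n\) values to \(\mathbb{P}(Z)\).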