Module 03

Conditional probability and independence


TC and DJM

Last modified — 04 Feb 2026

1 Conditional probability

Conditional probability

  • In general, the outcome of a random experiment can be any element of \(\Omega\).

  • Sometimes, we have “partial information” about which elements can occur.

Example
  • Roll a die.

  • If \(A\) is the event of obtaining a “2”, then \(\mathbb{P}(A) = 1/6\).

  • But if the outcome is known to be even, then intuition suggests that \(\mathbb{P}(A) > 1/6\).

  • Conditional probability formalizes this intuition (and helps to avoid mistakes).

Conditional probability

  • Two events play distinct roles in this example:

  • The event of interest \(A = \{ 2 \}\)

  • The conditioning event \[B= \{\text{outcome is even}\} = \{2, 4, 6\}\]

  • The conditioning event captures the “partial information”

Conditional probability, formal definition

  • Let \(A, B \subseteq \Omega\) and assume \(\mathbb{P}(B) > 0\)
Definition
  • The conditional probability of \(A\) given \(B\) is \[\mathbb{P}\left(A \ \vert\ B \right) \, = \, \frac{ \mathbb{P}\left( A \cap B \right) }{ \mathbb{P}\left( B \right) }\]
  • Just as \(\mathbb{P}(\cdot)\) is a function, for any fixed \(B\), \[\mathbb{Q}_B \left( \ \cdot \ \right) \, = \, \mathbb{P}\left(\ \cdot \ \ \vert\ B \right)\] is a function. Its argument is any event \(A \subseteq \Omega\).

  • Moreover, \(\mathbb{P}\left(\ \cdot \ \ \vert\ B \right)\) satisfies the three Axioms of a Probability.

\(\mathbb{P}(\cdot | B)\) is a probability

Exercise 1
Prove that \(\mathbb{P}(\cdot | B)\) is a probability.

Formalizing our intuition

Your friend rolls a fair die once, looks at it, and tells you that the result is even.

What is \(\mathbb{P}(\{2\} \ \vert\ \text{even})\)?

  • The event of interest \(A = \{ 2 \}\)

  • The conditioning event \(B= \{\text{outcome is even}\} = \{2, 4, 6\}\)

  • Therefore, \[ \mathbb{P}(A \ \vert\ B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} = \frac{1/6}{1/2} = 1/3 > 1/6 = \mathbb{P}(A). \]
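The calculation above can be checked by direct enumeration. The following is a minimal sketch, assuming a fair six-sided die so that every outcome has probability 1/6; the helper names `prob` and `cond_prob` are illustrative and not part of the course material.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (assumption: all outcomes equally likely).
omega = frozenset(range(1, 7))

def prob(event):
    """P(event) under the uniform probability on omega."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b); only defined when P(b) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / prob(b)

A = {2}          # event of interest
B = {2, 4, 6}    # conditioning event: the outcome is even

print(prob(A))          # 1/6
print(cond_prob(A, B))  # 1/3

# P(. | B) assigns total mass 1 to the sample space, as a probability must.
total_mass = sum(cond_prob({w}, B) for w in omega)
print(total_mass)       # 1
```

Exact fractions are used instead of floats so that the equalities hold without rounding error.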

Multiplication property

If \(\mathbb{P}(A_1) > 0\), then \[ \mathbb{P}\left( A_{1} \cap A_{2}\right) = \mathbb{P}\left(A_{2}\ \vert\ A_{1}\right) \, \mathbb{P}\left( A_{1}\right). \]

Corollary
If \(\mathbb{P}(A_1),\ \mathbb{P}(A_1 \cap A_2),\dots,\ \mathbb{P}(A_1 \cap A_2 \cap \dots \cap A_{n-1}) > 0\), then \[\begin{aligned} \mathbb{P}\left( A_{1}\cap A_{2}\cap \cdots \cap A_{n}\right) &= \mathbb{P}\left( A_{n}\ \vert\ A_{1}\cap A_{2}\cap \cdots \cap A_{n-1}\right) \\ &\quad \times \mathbb{P}\left( A_{n-1}\ \vert\ A_{1}\cap A_{2}\cap \cdots \cap A_{n-2}\right) \\ & \quad \times \cdots \times\\ & \quad \times \mathbb{P}\left( A_{3}\ \vert\ A_{1}\cap A_{2}\right) \times \mathbb{P}\left( A_{2}\ \vert\ A_{1}\right) \times \mathbb{P}\left( A_{1}\right) \end{aligned}\]

Proof of multiplication property

Proof
To fix ideas, look at the case \(n=4\) \[\begin{aligned} & \mathbb{P}(A_{1}) \mathbb{P}(A_{2} \ \vert\ A_{1}) \mathbb{P}(A_{3} \ \vert\ A_{1} \cap A_{2}) \mathbb{P}(A_{4} \ \vert\ A_{1} \cap A_{2} \cap A_{3}) \\ &= \mathbb{P}(A_{1}) \frac{\mathbb{P}(A_{1} \cap A_{2})} {\mathbb{P}(A_{1})} \frac{\mathbb{P}(A_{1} \cap A_{2} \cap A_{3})} {\mathbb{P}(A_{1} \cap A_{2})} \frac{\mathbb{P}(A_{1} \cap A_{2} \cap A_{3}\cap A_{4})} {\mathbb{P}(A_{1} \cap A_{2} \cap A_{3})}\\ &=\mathbb{P}(A_{1} \cap A_{2} \cap A_{3} \cap A_{4}). \end{aligned}\]

The proof is the same for any \(n\). Everything cancels except what we want.
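As a numerical illustration of the multiplication property with \(n = 3\), the sketch below enumerates every ordered draw of three balls from a small hypothetical urn. The urn contents (3 red, 2 black) are an assumption chosen to keep the enumeration tiny, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical urn: 3 red and 2 black balls, drawn without replacement.
balls = ["R1", "R2", "R3", "B1", "B2"]
omega = list(permutations(balls, 3))  # 60 equally likely ordered draws

def prob(event):
    """P(event), where event is a predicate on an ordered draw."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def cond_prob(event, given):
    """P(event | given) from the definition of conditional probability."""
    return prob(lambda w: event(w) and given(w)) / prob(given)

def red(k):
    """Event: the ball drawn at position k (0-based) is red."""
    return lambda w: w[k].startswith("R")

a1, a2, a3 = red(0), red(1), red(2)
a1_and_a2 = lambda w: a1(w) and a2(w)

# Left side: P(A1 ∩ A2 ∩ A3), computed by counting outcomes.
lhs = prob(lambda w: a1(w) and a2(w) and a3(w))
# Right side: P(A3 | A1 ∩ A2) P(A2 | A1) P(A1).
rhs = cond_prob(a3, a1_and_a2) * cond_prob(a2, a1) * prob(a1)

print(lhs, rhs)  # 1/10 1/10
```

The three factors on the right are 1/3, 1/2 and 3/5, which is the familiar "one draw at a time" calculation.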

Urns and Balls

Exercise 2
  • An urn has 10 red balls and 40 black balls.

  • Three balls are randomly drawn without replacement.

Calculate the probability that:

  1. The 3rd ball is red given that the 1st is red and the 2nd is black.

  2. The first drawn ball is red, the 2nd is black and the 3rd is red.

“Total probability” formula

Definition

We say that \(B_{1}, \ldots, B_{n}\) is a partition of \(\Omega\) if

  1. They are pairwise disjoint: \(B_{i}\cap B_{j} \, = \, \varnothing \quad \mbox{ for } i \ne j \, ,\)

  2. They cover the whole sample space: \(\bigcup_{i=1}^{n} B_{i} \, = \, \Omega\)

Theorem
If \(B_{1}, \ldots, B_{n}\) is a partition of \(\Omega\) with \(\mathbb{P}(B_{i}) > 0\) for every \(i\), then, for any event \(A \subseteq \Omega\), \[\mathbb{P}\left( A\right) =\sum_{i=1}^{n} \mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right).\]

Proof of Total Probability

Proof
  • \(A = A \cap \Omega = A \cap \left( \bigcup _{i=1}^{n}B_{i}\right) = \bigcup_{i=1}^{n}\left( A \cap B_{i}\right)\)

  • The events \(\left( A \cap B_{i}\right)\) are disjoint.

  • Therefore, by Axiom 3, we have \[\begin{aligned} \mathbb{P}\left( A\right) & = \mathbb{P}\left( \bigcup_{i=1}^{n} A \cap B_{i} \right) \\ &=\sum_{i=1}^{n} \mathbb{P}\left( A\cap B_{i}\right) \\ &=\sum_{i=1}^{n} \mathbb{P}\left( A\ \vert\ B_{i}\right) \, \mathbb{P}\left( B_{i}\right). \end{aligned}\]
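The total probability formula can be checked on a small example. This is a sketch under stated assumptions: a fair six-sided die, a partition into three pairs of outcomes, and \(A\) the event that the outcome is at least 4. All of these choices, and the helper names, are illustrative.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # fair six-sided die (assumption)

def prob(event):
    """P(event) under the uniform probability on omega."""
    return Fraction(len(event), len(omega))

def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b), assuming P(b) > 0."""
    return prob(a & b) / prob(b)

# Hypothetical partition of omega and event of interest.
partition = [{1, 2}, {3, 4}, {5, 6}]
A = {4, 5, 6}

# Check the two defining properties of a partition.
covers = set().union(*partition) == omega
disjoint = sum(len(b) for b in partition) == len(omega)

# Right-hand side of the formula: sum over i of P(A | B_i) P(B_i).
total = sum(cond_prob(A, b) * prob(b) for b in partition)

print(covers, disjoint)  # True True
print(total, prob(A))    # 1/2 1/2
```

Here the three terms of the sum are 0, 1/6 and 1/3, one for each block of the partition.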

Flu tests

  • Suppose that every patient who visits the ER is given a flu test.
  • Suppose that 30% of patients have flu.
  • The test is known to have 90% sensitivity: a patient with flu tests positive 90% of the time.
  • The test is known to have 80% specificity: a patient without flu tests negative 80% of the time.

Exercise 3
What percentage of patients test positive for flu?

Bayes formula

Theorem
Let \(A\) be an event with \(\mathbb{P}(A)>0\), and let \(B_{1}\), \(B_{2}\), …, \(B_{n}\) be a partition of \(\Omega\) with \(\mathbb{P}(B_{j})>0\) for all \(j\). Then, for each \(i=1,\dots,n\), we have \[\mathbb{P}\left( B_{i} \ \vert\ A\right) \, = \, \frac{\mathbb{P}\left( A\ \vert\ B_{i} \right) \, \mathbb{P}\left( B_{i}\right) }{ \sum_{j=1}^{n} \mathbb{P}\left( A\ \vert\ B_{j}\right) \, \mathbb{P}\left( B_{j}\right) }\]

Proof
\[\begin{aligned} \mathbb{P}\left( B_{i} \ \vert\ A\right) &=\frac{\mathbb{P}\left( A\cap B_{i}\right)}{\mathbb{P}\left( A\right) } & \text{(Definition of conditional prob)} \\ &=\frac{ \mathbb{P}\left( A\ \vert\ B_{i} \right) \, \mathbb{P}\left( B_{i} \right)}{\mathbb{P}\left( A\right) } & \text{(Multiplication Rule)} \\ &=\frac{\mathbb{P}\left( A\ \vert\ B_{i} \right) \, \mathbb{P}\left( B_{i} \right) }{\sum_{j=1}^{n} \mathbb{P}\left( A\ \vert\ B_{j}\right) \, \mathbb{P}\left( B_{j}\right) } & \text{(Rule of Total Prob)} \end{aligned} \]
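Bayes formula is mechanical enough to write as a small function. The sketch below is one possible implementation, not a reference one: `bayes_posterior` is a hypothetical helper, and the numbers come from an assumed fair die with the partition \(\{1,2\}, \{3,4\}, \{5,6\}\) and \(A = \{4, 5, 6\}\).

```python
from fractions import Fraction

def bayes_posterior(likelihoods, priors):
    """Return the list of P(B_i | A), given P(A | B_i) and P(B_i) over a partition."""
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 over a partition")
    # Numerators P(A | B_i) P(B_i), one per block of the partition.
    joint = [l * p for l, p in zip(likelihoods, priors)]
    # Denominator: P(A), by the total probability formula.
    p_a = sum(joint)
    if p_a == 0:
        raise ValueError("Bayes formula requires P(A) > 0")
    return [j / p_a for j in joint]

# Assumed example: fair die, B_1 = {1,2}, B_2 = {3,4}, B_3 = {5,6}, A = {4,5,6}.
priors = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]    # P(B_i)
likelihoods = [Fraction(0), Fraction(1, 2), Fraction(1)]     # P(A | B_i)

posterior = bayes_posterior(likelihoods, priors)
print([str(p) for p in posterior])  # ['0', '1/3', '2/3']
```

The posterior probabilities sum to 1, and they agree with computing \(\mathbb{P}(A \cap B_i)/\mathbb{P}(A)\) directly: for example, \(A \cap B_3 = \{5, 6\}\) gives \((2/6)/(1/2) = 2/3\).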

Flu prevalence

  • Suppose that every patient who visits the ER is given a flu test.
  • Suppose that 30% of tests are positive.
  • The test is known to have 90% sensitivity: a patient with flu tests positive 90% of the time.
  • The test is known to have 80% specificity: a patient without flu tests negative 80% of the time.
Exercise 4

Suppose you bring a friend to the ER.

  1. What is the prevalence of flu?
  2. What is the probability that your friend has flu if they test positive?
  3. What is the probability that your friend has flu if they test negative?
  4. What is the probability that a test is wrong?

The Monty Hall Problem

  • You are a contestant on a game show. In front of you are three doors.
  • Behind two doors are goats. 🐐
  • Behind one door is a car. 🚗

You select a door. The host then opens one of the 2 remaining doors, revealing a goat 🐐. The host asks:

Would you like to switch to the remaining closed door?

The Monty Hall Problem

Exercise 5
Show that the probability of winning the car if you switch doors is 2/3.
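A simulation is not a proof, but it is a useful sanity check on the claimed value of 2/3. The sketch below makes assumptions that the problem statement leaves implicit: the car is placed uniformly at random, the host knows where it is, the host always opens a goat door other than the contestant's pick, and the host chooses at random when two such doors are available. The function names and the seed are illustrative.

```python
import random

def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)    # car placed uniformly at random (assumption)
    pick = rng.choice(doors)   # contestant's initial choice
    # Host opens a door that hides a goat and is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, n_games=100_000, seed=0):
    """Estimate the probability of winning by playing n_games independent games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

print("switch:", win_rate(True))
print("stay:  ", win_rate(False))
```

With 100,000 games, the two estimates should land close to 2/3 and 1/3 respectively. The exercise then asks for the argument that explains why.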

2 Independence

Independence

Definition
We say that events \(A\) and \(B\) are independent if \[\mathbb{P}\left( A\cap B\right) \, = \, \mathbb{P}\left( A \right) \, \mathbb{P}\left( B\right).\]

Theorem
If \(A\) and \(B\) are independent and \(\mathbb{P}\left( B\right) >0\), then

\[\mathbb{P}\left( A\ \vert\ B\right) = \frac{\mathbb{P}\left( A\cap B\right) }{\mathbb{P}\left( B\right) } = \frac{\mathbb{P}\left( A\right) \mathbb{P}\left( B\right) }{\mathbb{P}\left( B\right) } = \mathbb{P}\left( A\right).\]

  • Knowing that \(B\) occurred does not change the probability of \(A\).

  • By symmetry, if \(\mathbb{P}(A) > 0\), knowing that \(A\) occurred does not change the probability of \(B\).

  • Thus the name: “independent events”
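The definition of independence is a single equality, so it is easy to test on a finite sample space. The sketch below assumes two fair dice rolled together; the events and the helper `independent` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered results of two fair dice (36 equally likely outcomes).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """P(event) under the uniform probability on omega."""
    return Fraction(len(event), len(omega))

def independent(a, b):
    """Check the defining equality P(a ∩ b) = P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

first_even = {w for w in omega if w[0] % 2 == 0}
sum_is_7 = {w for w in omega if sum(w) == 7}
sum_is_8 = {w for w in omega if sum(w) == 8}

print(independent(first_even, sum_is_7))  # True:  1/12 == (1/2)(1/6)
print(independent(first_even, sum_is_8))  # False: 1/12 != (1/2)(5/36)
```

The two cases look similar but behave differently, which shows that independence is a property of the probabilities and usually cannot be read off from the description of the events.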

Independence and trivial probabilities

Definition
We say that an event \(A\) is non-trivial if \(0<\mathbb{P}\left( A\right) <1\).

Theorem

Let \(A\) and \(B\) be non-trivial events. Then,

  1. If \(A\cap B=\varnothing\) then \(A\) and \(B\) are not independent
  2. If \(A\subset B\) then \(A\) and \(B\) are not independent.
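Both statements can be illustrated (not proved) on a single fair die. The events below are assumptions chosen for the illustration, and the last check shows why the non-triviality hypothesis is needed.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # fair six-sided die (assumption)

def prob(event):
    """P(event) under the uniform probability on omega."""
    return Fraction(len(event), len(omega))

def independent(a, b):
    """Check the defining equality P(a ∩ b) = P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A = {2}
odd = {1, 3, 5}    # disjoint from A
even = {2, 4, 6}   # contains A

# 1. Disjoint non-trivial events: P(A ∩ B) = 0, but P(A) P(B) = 1/12 > 0.
print(independent(A, odd))             # False
# 2. Nested non-trivial events: P(A ∩ B) = P(A) = 1/6, but P(A) P(B) = 1/12.
print(independent(A, even))            # False
# A is also a subset of omega, yet omega is trivial (probability 1),
# and the defining equality holds: P(A ∩ Ω) = P(A) = P(A) * 1.
print(independent(A, set(omega)))      # True
```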

Exercise 6
Prove the theorem.