Lecture 11

Conditional Distributions and Independence


Grace Tompkins (Presented by Dr. Geoff Pleiss)

Last modified — 21 Jun 2026

Learning Outcomes

By the end of this lecture, students should be able to:

  • Calculate conditional distributions from a joint distribution
  • Relate the notion of independence to conditional and joint distributions

1 Conditional Distributions

Conditioning on Discrete RVs

Let \(Y\) be a random variable and let \(X\) be a discrete random variable. Suppose that \(\mathbb{P}(X = x) >0\) for some \(x\). Then the conditional distribution of \(Y\) given \(X = x\) assigns to each set \(A \subset {\mathbb{R}}\) the probability \[\mathbb{P}(Y \in A \ \vert\ X = x) = \frac{\mathbb{P}(Y \in A, X = x)}{\mathbb{P}(X = x)}.\]

Here, we do not specify whether \(Y\) is continuous or discrete.

Conditioning on Discrete RVs

If \(X\) and \(Y\) are both discrete random variables, then the conditional PMF of \(Y\) given \(X = x\) is defined by \[p_{Y|X}(y \ \vert\ x) = \frac{\mathbb{P}(Y = y, X = x)}{\mathbb{P}(X = x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}.\]

Conditioning on Discrete RVs

  • Let \(X\) and \(Y\) be the results of rolling two fair 6-sided dice.
  • Let \(V = X + Y\) and \(W = \max\{X, Y\}\).

The joint PMF of \(W\) and \(V\) is

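\[\begin{array}{c|ccccccccccc}
W\ \backslash\ V & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \hline
1 & 1/36 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
2 & 0 & 2/36 & 1/36 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
3 & 0 & 0 & 2/36 & 2/36 & 1/36 & 0 & 0 & 0 & 0 & 0 & 0 \\
4 & 0 & 0 & 0 & 2/36 & 2/36 & 2/36 & 1/36 & 0 & 0 & 0 & 0 \\
5 & 0 & 0 & 0 & 0 & 2/36 & 2/36 & 2/36 & 2/36 & 1/36 & 0 & 0 \\
6 & 0 & 0 & 0 & 0 & 0 & 2/36 & 2/36 & 2/36 & 2/36 & 2/36 & 1/36
\end{array}\]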

To condition on \(W\), we look at a particular row; to condition on \(V\), we look at a particular column. We then renormalize by dividing each entry by the sum of that row or column.

Given the above table, what is \(\mathbb{P}(W = 5 \ \vert\ V = 8)\)?
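
One way to check an answer to this question is to rebuild the joint PMF by enumerating the 36 equally likely outcomes of the two dice and then renormalize the \(V = 8\) column. The following is a minimal Python sketch; the names `joint` and `conditional_w_given_v` are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of (W, V): enumerate the 36 equally likely outcomes of two fair dice.
joint = {}
for x, y in product(range(1, 7), repeat=2):
    key = (max(x, y), x + y)  # (w, v)
    joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 36)


def conditional_w_given_v(v):
    """Return the conditional PMF of W given V = v as a dict {w: probability}."""
    # Fix the column V = v ...
    column = {w: p for (w, vv), p in joint.items() if vv == v}
    total = sum(column.values())  # ... its sum is P(V = v) ...
    if total == 0:
        raise ValueError(f"P(V = {v}) is zero, so conditioning is undefined")
    # ... and renormalize by dividing each entry by that sum.
    return {w: p / total for w, p in column.items()}


cond = conditional_w_given_v(8)
print(cond[5])  # 2/5
```

The column for \(V = 8\) sums to \(5/36\), so the entry \(2/36\) in row \(W = 5\) becomes \((2/36)/(5/36) = 2/5\).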

Conditioning on Continuous RVs

If \(X\) and \(Y\) are jointly absolutely continuous random variables, then the conditional density of \(Y\) given \(X = x\) is the function \[f_{Y|X}(y \ \vert\ x) = \frac{f_{X,Y}(x,y)}{f_X(x)},\] valid for any \(y\in{\mathbb{R}}\) and all \(x\) such that \(f_X(x) > 0\).

Conditioning on Continuous RVs

Let \(X\) and \(Y\) be jointly absolutely continuous random variables with joint PDF \(f_{X,Y}\). The conditional distribution of \(Y\) given \(X = x\) assigns to each interval \([a, b] \subset {\mathbb{R}}\) the probability \[\mathbb{P}(a \le Y \le b \ \vert\ X = x) = \int_a^b f_{Y|X}(y\ \vert\ x)\, \mathsf{d}y,\] valid for all \(x\) such that \(f_X(x) > 0\).

Conditioning on Continuous RVs

Let \(X\) and \(Y\) be jointly continuous random variables with joint PDF \[f_{X,Y}(x,y) = \frac{1}{x} e^{-x} I_{\{0 \le y \le x\}}(x,y).\]

  1. Find the marginal PDF of \(X\). What distribution does \(X\) have?
  2. Find the conditional PDF of \(Y\) given \(X = x\). What distribution does \(Y \ \vert\ X\) have?
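
A sketch of one possible solution, assuming the support is restricted to \(x > 0\) so that \(1/x\) is well defined. The marginal comes from integrating out \(y\), and the conditional from dividing the joint by that marginal:

```latex
\begin{aligned}
f_X(x) &= \int_{-\infty}^{\infty} f_{X,Y}(x,y)\, \mathsf{d}y
        = \int_0^x \frac{1}{x} e^{-x}\, \mathsf{d}y
        = e^{-x}, \qquad x > 0, \\
f_{Y|X}(y \ \vert\ x) &= \frac{f_{X,Y}(x,y)}{f_X(x)}
        = \frac{\frac{1}{x} e^{-x}}{e^{-x}}
        = \frac{1}{x}, \qquad 0 \le y \le x.
\end{aligned}
```

Under that assumption, \(X \sim {\mathrm{Exp}}(1)\) and \(Y \ \vert\ X = x \sim {\mathrm{Unif}}(0, x)\).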

Conditioning on Continuous RVs

Let \(X\) and \(Y\) be independent with \(X \sim {\mathrm{Exp}}(1)\) and \(Y\sim {\mathrm{Exp}}(2)\). Let \(S = X + Y\). Find the conditional density \(f_{X|S}(x \ \vert\ s)\).
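
A sketch of one possible solution, assuming the rate parametrization (so that \({\mathrm{Exp}}(\theta)\) has density \(\theta e^{-\theta z}\) for \(z > 0\)). The change of variables \((x, y) \mapsto (x, s)\) with \(s = x + y\) has Jacobian 1, so the joint density of \((X, S)\) is \(f_X(x) f_Y(s - x)\):

```latex
\begin{aligned}
f_{X,S}(x,s) &= f_X(x)\, f_Y(s - x) = e^{-x} \cdot 2 e^{-2(s - x)} = 2 e^{x - 2s}, \qquad 0 < x < s, \\
f_S(s) &= \int_0^s 2 e^{x - 2s}\, \mathsf{d}x = 2 e^{-2s} \left(e^{s} - 1\right), \qquad s > 0, \\
f_{X|S}(x \ \vert\ s) &= \frac{f_{X,S}(x,s)}{f_S(s)} = \frac{e^{x}}{e^{s} - 1}, \qquad 0 < x < s.
\end{aligned}
```

As a sanity check, \(\int_0^s e^{x} / (e^{s} - 1)\, \mathsf{d}x = 1\), so the result is a valid density in \(x\) for each fixed \(s > 0\).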

Heuristics: Types of Distributions of Random Variables

  • Suppose that \(X\) and \(Y\) have a joint distribution \(\mathbb{P}(X \in A, Y \in B)\).
  • We can find the joint CDF, PMF, or PDF of \(X\) and \(Y\).
  • The marginal distribution of \(X\) is the distribution of \(X\) when we “ignore” \(Y\).
  • The conditional distribution of \(X\) given \(Y = y\) is the distribution of \(X\) when we “fix” \(Y\) to be \(y\).
  • The marginal comes from “summing out” or “integrating out” the other variable along the rows or columns of the table.
  • The conditional comes from “dividing out” the other variable. Fix a row or a column, and then renormalize (divide out).
  • When we describe \(Y\ \vert\ X=x\), we usually specify a formula: a function that gives the conditional PMF or PDF of \(Y\) given \(X = x\) for all \(x\) such that \(p_X(x) > 0\) or \(f_X(x) > 0\). By contrast, renormalizing one row of a table gives the conditional PMF only for that specific value of \(x\).
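
The "summing out" and "dividing out" heuristics above can be sketched in Python using the two-dice example from earlier in the lecture (\(W\) the maximum, \(V\) the sum). This is an illustrative sketch only; the names `marginal_w`, `marginal_v`, and `conditional_v_given_w` are assumptions, not course notation.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of (W, V) for two fair dice, with W = max and V = sum.
joint = {}
for x, y in product(range(1, 7), repeat=2):
    key = (max(x, y), x + y)
    joint[key] = joint.get(key, Fraction(0)) + Fraction(1, 36)

# Marginal of W: sum each row over all values of V ("summing out" V).
marginal_w = {}
for (w, v), p in joint.items():
    marginal_w[w] = marginal_w.get(w, Fraction(0)) + p

# Marginal of V: sum each column over all values of W ("summing out" W).
marginal_v = {}
for (w, v), p in joint.items():
    marginal_v[v] = marginal_v.get(v, Fraction(0)) + p


def conditional_v_given_w(w):
    """Conditional PMF of V given W = w: fix row w, then divide by its sum."""
    return {v: p / marginal_w[w] for (ww, v), p in joint.items() if ww == w}


print(marginal_w[3])  # 5/36
print(conditional_v_given_w(3))
```

Calling `conditional_v_given_w` with a different argument gives a different conditional PMF, which is the sense in which a conditional distribution is a function of the value being conditioned on.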

2 Independence

Independent Random Variables

\(X\) and \(Y\) are independent if and only if for any sets \(A\) and \(B\) we have \[\mathbb{P}( X \in A,\ Y \in B ) = \mathbb{P}( X \in A )\ \mathbb{P}( Y \in B ).\]

This is the definition of independence. It says that the joint distribution factors into the product of the marginals.

But this has immediate consequences for the joint CDF, PMF, and PDF.

Choosing \(A = (-\infty, x]\) and \(B = (-\infty, y]\), we have that, if \(X\) and \(Y\) are independent, then \[\begin{aligned} F_{X,Y}(x,y) &= \mathbb{P}(X \le x, Y \le y) = \mathbb{P}(X \le x) \mathbb{P}(Y \le y) = F_X(x) F_Y(y). \end{aligned}\]

The converse is also true: if \(F_{X,Y}(x,y) = F_X(x) F_Y(y)\) for all \(x, y\), then \(X\) and \(Y\) are independent.

Independent Random Variables, Using PMFs

If \(X\) and \(Y\) are discrete random variables, then \(X\) and \(Y\) are independent if and only if \[p_{X,Y}(x,y) = p_X(x) p_Y(y),\] for all \(x, y \in {\mathbb{R}}\).

Independent Random Variables, Using PDFs

If \(X\) and \(Y\) are jointly continuous random variables, then \(X\) and \(Y\) are independent if and only if their joint density can be chosen such that \[f_{X,Y}(x,y) = f_X(x) f_Y(y),\] for all \(x, y \in {\mathbb{R}}\).

  • Regardless of continuous or discrete, when you hear “independent”, think “joint factors into the product of the marginals”.
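
For discrete random variables with finitely many values, the factorization criterion can be checked directly. The sketch below, with illustrative helper names `marginals` and `is_independent`, tests it on two joint PMFs built from a pair of fair dice: the dice themselves, \((X, Y)\), and their maximum and sum, \((W, V)\).

```python
from fractions import Fraction
from itertools import product


def marginals(joint):
    """Return the two marginal PMFs of a joint PMF given as {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, Fraction(0)) + p
        pb[b] = pb.get(b, Fraction(0)) + p
    return pa, pb


def is_independent(joint):
    """Check p(a, b) == p_A(a) * p_B(b) on the full grid of marginal supports."""
    pa, pb = marginals(joint)
    return all(
        joint.get((a, b), Fraction(0)) == pa[a] * pb[b]
        for a in pa
        for b in pb
    )


outcomes = list(product(range(1, 7), repeat=2))

# (X, Y): the two dice themselves, each outcome with probability 1/36.
joint_xy = {(x, y): Fraction(1, 36) for x, y in outcomes}

# (W, V): the maximum and the sum of the same two dice.
joint_wv = {}
for x, y in outcomes:
    key = (max(x, y), x + y)
    joint_wv[key] = joint_wv.get(key, Fraction(0)) + Fraction(1, 36)

print(is_independent(joint_xy))  # True
print(is_independent(joint_wv))  # False
```

The second check fails because, for example, \(\mathbb{P}(W = 1, V = 3) = 0\) while \(\mathbb{P}(W = 1)\,\mathbb{P}(V = 3) > 0\). The check must cover pairs with zero joint probability, which is why it runs over the full grid rather than only the keys of `joint`.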

Independence

Let \(X\) and \(Y\) have joint PDF \[f_{X,Y}(x, y) = \begin{cases}8xy & \text{if } 0 \le x < y < 1 \\ 0 & \text{otherwise.} \end{cases}\] Find the marginal PDFs of \(X\) and \(Y\). Are \(X\) and \(Y\) independent?
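
A sketch of one possible solution. Each marginal is obtained by integrating the joint density over the other variable, keeping track of the constraint \(x < y\):

```latex
\begin{aligned}
f_X(x) &= \int_x^1 8xy\, \mathsf{d}y = 4x\left(1 - x^2\right), \qquad 0 \le x < 1, \\
f_Y(y) &= \int_0^y 8xy\, \mathsf{d}x = 4y^3, \qquad 0 < y < 1, \\
f_X(x)\, f_Y(y) &= 16\, x y^3 \left(1 - x^2\right), \qquad 0 < x < 1,\ 0 < y < 1.
\end{aligned}
```

On the region \(0 < y < x < 1\), the joint density is 0 while the product of the marginals is strictly positive, so the joint density does not factor and \(X\) and \(Y\) are not independent. More generally, a support that is not a rectangle is a quick warning sign of dependence.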

Maximum of Independent RVs

Let \(X\) and \(Y\) be independent random variables with CDFs \(F_X\) and \(F_Y\).

Let \(W = \max\{X, Y\}\).

Find the CDF of \(W\).
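
A sketch of one possible solution. The key observation is that the maximum is at most \(w\) exactly when both variables are at most \(w\); independence is used in the second line:

```latex
\begin{aligned}
F_W(w) &= \mathbb{P}(\max\{X, Y\} \le w) = \mathbb{P}(X \le w,\ Y \le w) \\
       &= \mathbb{P}(X \le w)\, \mathbb{P}(Y \le w) = F_X(w)\, F_Y(w).
\end{aligned}
```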

Maximum of Independent RVs

The previous example also extends to the maximum of \(n\) independent random variables.

Suppose that \(X_1, X_2, \dots, X_n\) are independent random variables with common CDF \(F_X\). Let \(W = \max\{X_1, X_2, \dots, X_n\}\). Then \[\begin{aligned} F_W(w) &= \mathbb{P}(W \le w) = \mathbb{P}(X_1 \le w, X_2 \le w, \dots, X_n \le w) = (F_X(w))^n,\\ \Longrightarrow f_W(w) &= \frac{\mathsf{d}}{\mathsf{d}w} F_W(w) = n (F_X(w))^{n-1} f_X(w). \quad\quad\text{(by the chain rule if $X$ is continuous)} \end{aligned}\]
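
The formula \(F_W(w) = (F_X(w))^n\) can be checked by simulation. The sketch below assumes, purely for illustration, that the \(X_i\) are \({\mathrm{Unif}}(0, 1)\), so that \(F_X(w) = w\) on \([0, 1]\); the function name `empirical_cdf_of_max` and the choices \(n = 5\), \(w = 0.8\) are hypothetical.

```python
import random


def empirical_cdf_of_max(n, w, trials=100_000, seed=0):
    """Estimate P(max(X_1, ..., X_n) <= w) for independent Unif(0, 1) draws."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    hits = sum(
        max(rng.random() for _ in range(n)) <= w
        for _ in range(trials)
    )
    return hits / trials


n, w = 5, 0.8
estimate = empirical_cdf_of_max(n, w)
exact = w ** n  # (F_X(w))^n with F_X(w) = w on [0, 1]
print(estimate, exact)
```

With 100,000 trials the estimate should agree with the exact value \(0.8^5 = 0.32768\) to roughly two decimal places.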

Minimum of Two Independent Random Variables

Let \(X\sim {\mathrm{Exp}}(\lambda)\) and \(Y\sim{\mathrm{Exp}}(\mu)\) be independent random variables. Find the distribution of \(U = \min\{X, Y\}\).

Hints:

  • If \(Z\sim{\mathrm{Exp}}(\theta)\), then \(F_Z(z) = \mathbb{P}(Z \le z) = 1 - e^{-\theta z}\) for \(z > 0\).
  • Therefore, \(\mathbb{P}(Z > z) = e^{-\theta z}\) for \(z > 0\).
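
A sketch of one possible solution using the hints. The minimum exceeds \(u\) exactly when both variables exceed \(u\), so it is easiest to work with the survival function:

```latex
\begin{aligned}
\mathbb{P}(U > u) &= \mathbb{P}(X > u,\ Y > u) = \mathbb{P}(X > u)\, \mathbb{P}(Y > u)
                   = e^{-\lambda u}\, e^{-\mu u} = e^{-(\lambda + \mu) u}, \qquad u > 0, \\
F_U(u) &= 1 - e^{-(\lambda + \mu) u}, \qquad u > 0.
\end{aligned}
```

This is the CDF of an exponential distribution with rate \(\lambda + \mu\), so \(U \sim {\mathrm{Exp}}(\lambda + \mu)\).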

Sums of Independent Random Variables

To find the distribution of a sum of independent random variables, we could use the distribution method or the Jacobian method.

When the random variables are independent, there is also a third method, called convolution.

Let \(X\) and \(Y\) be independent random variables.

If \(X\) and \(Y\) are both discrete random variables, then the PMF of \(U = X + Y\) is given by \[p_U(u) = \sum_{w} p_X(u-w) p_Y(w) = \sum_w p_Y(u - w) p_X(w).\]

If \(X\) and \(Y\) are both continuous random variables, then the PDF of \(U = X + Y\) is given by \[f_U(u) = \int_{-\infty}^\infty f_X(u - w) f_Y(w) \mathsf{d}w = \int_{-\infty}^\infty f_Y(u - w) f_X(w) \mathsf{d}w.\]
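
The discrete convolution formula can be sketched in a few lines of Python. The function name `convolve_pmfs` is illustrative; it loops over pairs \((x, y)\) and accumulates \(p_X(x)\, p_Y(y)\) at \(u = x + y\), which is the same sum as \(\sum_w p_X(u - w)\, p_Y(w)\) with \(w = y\). The example uses two fair dice.

```python
from fractions import Fraction


def convolve_pmfs(p_x, p_y):
    """PMF of U = X + Y for independent X and Y given as {value: prob} dicts."""
    p_u = {}
    for x, px in p_x.items():
        for y, py in p_y.items():
            # Independence lets us multiply the marginal probabilities.
            p_u[x + y] = p_u.get(x + y, Fraction(0)) + px * py
    return p_u


die = {k: Fraction(1, 6) for k in range(1, 7)}
p_sum = convolve_pmfs(die, die)
print(p_sum[7])  # 1/6
```

The result matches the column sums of the dice table from the first part of the lecture: for instance, the \(V = 7\) column sums to \(6/36 = 1/6\).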

Sums of Independent Random Variables

Let \(X\) and \(Y\) be independent \({\mathrm{Unif}}(0, 1)\) random variables. Let \(S = X +Y\). Find the distribution of \(S\).
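
A sketch of one possible solution using the convolution formula. The integrand \(f_X(s - w)\, f_Y(w)\) equals 1 when both \(0 < w < 1\) and \(0 < s - w < 1\), and 0 otherwise, so the limits of integration depend on \(s\):

```latex
f_S(s) = \int_{-\infty}^{\infty} f_X(s - w)\, f_Y(w)\, \mathsf{d}w
       = \int_{\max\{0,\ s - 1\}}^{\min\{1,\ s\}} 1\, \mathsf{d}w
       = \begin{cases}
           s     & \text{if } 0 \le s \le 1, \\
           2 - s & \text{if } 1 < s \le 2, \\
           0     & \text{otherwise.}
         \end{cases}
```

The density is a triangle on \([0, 2]\) with peak at \(s = 1\); in particular, the sum of two uniform random variables is not uniform.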

Sums of Independent Random Variables

Let \(X\) and \(Y\) be independent \({\mathrm{Exp}}(\lambda)\) random variables. Find the PDF of \(U = X + Y\) using the convolution method.

Hint: be careful with the limits of integration. Recall that if \(Z\sim {\mathrm{Exp}}(\theta)\), then \(f_Z(z) = \theta e^{-\theta z}I_{(0,\infty)}(z)\).
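
A sketch of one possible solution. The indicator functions force \(w > 0\) and \(u - w > 0\), so for \(u > 0\) the integral runs over \(0 < w < u\):

```latex
\begin{aligned}
f_U(u) &= \int_{-\infty}^{\infty} f_X(u - w)\, f_Y(w)\, \mathsf{d}w
        = \int_0^u \lambda e^{-\lambda (u - w)}\, \lambda e^{-\lambda w}\, \mathsf{d}w \\
       &= \lambda^2 e^{-\lambda u} \int_0^u \mathsf{d}w
        = \lambda^2 u\, e^{-\lambda u}, \qquad u > 0,
\end{aligned}
```

and \(f_U(u) = 0\) for \(u \le 0\). This is the density of a Gamma distribution with shape 2 and rate \(\lambda\).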

To Do

  • Work on Assignment 3, due Wednesday June 10, 11:59pm on Gradescope.
  • Read Chapter 3.1 and 3.2 before next class.
  • Grace will be back for your next class! Please save your questions for her and/or the TAs where possible.