Module 07

Conditional distributions, independence, and expected values


TC and DJM

Last modified — 16 Mar 2026

1 More on transformations

Finding the distribution of a transformation of random variables

We’ve seen three methods so far:

  1. Adjusting the CDF of the original random variable(s).
  2. Using the distribution method.
  3. Using the Jacobian method.

Example of adjusting the CDF: \(X \sim {\mathrm{Exp}}(\lambda)\), find the distribution of \(Y = 3X\).

\[\begin{aligned} F_X(x) &= 1 - e^{-\lambda x} \quad\text{for } x > 0 & \Longrightarrow F_Y(y) &= F_X(y/3) = 1 - e^{-\lambda y/3} \quad\text{for } y > 0. \end{aligned}\]

Now, if we want the PDF of \(Y\), we can differentiate \(F_Y(y)\) to get \[f_Y(y) = \frac{\lambda}{3} e^{-\lambda y/3}I_{[0,\infty)}(y).\]

So, we hopefully recognize that \(Y \sim {\mathrm{Exp}}(\lambda/3)\).
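As a quick sanity check by simulation (assuming NumPy; the rate \(\lambda = 2\) is an arbitrary choice), the sample mean of \(Y = 3X\) should be close to \(3/\lambda\), the mean of an \({\mathrm{Exp}}(\lambda/3)\) variable:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0                                          # rate of X, arbitrary choice
x = rng.exponential(scale=1 / lam, size=100_000)   # X ~ Exp(lam)
y = 3 * x                                          # Y = 3X

# If Y ~ Exp(lam / 3), its mean should be near 3 / lam = 1.5.
print(y.mean())
```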

Distribution method

This is really just using the definition of the distribution.

But, we can sort of make it algorithmic. If you want to find the distribution of \(Y = g(X)\), then you can do the following: \[\mathbb{P}(Y \in A) = \mathbb{P}(g(X) \in A) = \mathbb{P}(X \in \{x : g(x) \in A\}) = \cdots\]

  1. Start with the definition of the distribution of \(Y\).
  2. Use the definition of \(Y\) in terms of \(X\) to rewrite the probability in terms of \(X\).
  3. Use the distribution of \(X\) to compute the probability that \(g(X) \in A\).
  4. Then play algebra games to get it into a nice form; possibly “match” a known PDF or CDF.

Example of distribution method: \(X \sim {\mathrm{Exp}}(\lambda)\), find the distribution of \(Y = 3X\).

By properties of the CDF (it uniquely determines the distribution), I can choose \(A = (-\infty, y]\). \[ \mathbb{P}(Y \in A) = \mathbb{P}(Y \le y) = \mathbb{P}(3X \le y) = \mathbb{P}(X \le y/3) = F_X(y/3) = 1 - e^{-\lambda y/3} \quad\text{for } y > 0. \]

This looks very similar to before. The “algebra games” were easy this time.

Jacobian method

Only works for differentiable, strictly monotonic transformations of continuous RVs.

But, this is very algorithmic. If you want to find the distribution of \(Y = g(X)\), then you can do the following:

  1. Find the inverse transformation \(g^{-1}(y)\), and the derivative of the inverse transformation \(\frac{\mathsf{d}}{\mathsf{d}y} g^{-1}(y)\).
  2. Plug the inverse transformation and its derivative into the formula for the PDF of \(X\): \[f_Y(y) = f_X(g^{-1}(y)) \left|\frac{\mathsf{d}}{\mathsf{d}y} g^{-1}(y)\right|.\]
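The two steps can be sketched in code. This is a minimal illustration, not course notation: `jacobian_pdf` is a hypothetical helper, and the example reuses the \(Y = 3X\) transformation already worked above.

```python
import math

def jacobian_pdf(f_X, g_inv, g_inv_prime):
    """PDF of Y = g(X) via f_Y(y) = f_X(g^{-1}(y)) |d/dy g^{-1}(y)|."""
    return lambda y: f_X(g_inv(y)) * abs(g_inv_prime(y))

lam = 2.0
f_X = lambda x: lam * math.exp(-lam * x) if x > 0 else 0.0   # X ~ Exp(lam)

# Y = 3X: the inverse transformation is g^{-1}(y) = y/3, with derivative 1/3.
f_Y = jacobian_pdf(f_X, lambda y: y / 3, lambda y: 1 / 3)
print(f_Y(1.0))   # matches (lam/3) * exp(-lam * 1.0 / 3)
```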

Exercise 1
Example of Jacobian method: \(X \sim {\mathrm{Exp}}(\lambda)\), find the distribution of \(Y = 3X\).

A more complicated example using the distribution method

Exercise 2

Let \(Y = F_X(X)\) where \(F_X\) is the CDF of \(X\). Assume that \(X\) is absolutely continuous with strictly positive density.

  1. What is the definition of \(F_X(x)\)? Is it monotonic? Is \(F_X(x)\) invertible?
  2. What is the range of \(Y\)?
  3. Find the distribution of \(Y\) using the distribution method.
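A simulation sketch you can use to check your answer to part 3 (assuming NumPy; the Exponential is just one convenient absolutely continuous choice of \(X\) with strictly positive density on its support):

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 1.0
x = rng.exponential(scale=1 / lam, size=100_000)   # X ~ Exp(1)
y = 1 - np.exp(-lam * x)                           # Y = F_X(X)

# Compare the empirical CDF of Y at a few points against your answer.
for t in (0.25, 0.5, 0.75):
    print(t, round(float((y <= t).mean()), 3))
```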

2 Conditional distributions and independence

Conditioning on discrete RVs

Definition
Let \(X\) and \(Y\) be random variables, and suppose that \(\mathbb{P}(X = x) >0\) for some \(x\). Then the conditional distribution of \(Y\) given \(X = x\) assigns to each set \(A \subset {\mathbb{R}}\) the probability \[\mathbb{P}(Y \in A \ \vert\ X = x) = \frac{\mathbb{P}(Y \in A, X = x)}{\mathbb{P}(X = x)}.\]

Definition
If \(X\) and \(Y\) are discrete random variables, then the conditional PMF of \(Y\) given \(X = x\) is defined by \[p_{Y|X}(y \ \vert\ x) = \frac{\mathbb{P}(Y = y, X = x)}{\mathbb{P}(X = x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}.\]

Conditioning on continuous RVs

Definition
If \(X\) and \(Y\) are jointly absolutely continuous random variables, then the conditional PDF of \(Y\) given \(X = x\), is the function \[f_{Y|X}(y \ \vert\ x) = \frac{f_{X,Y}(x,y)}{f_X(x)},\] valid for any \(y\in{\mathbb{R}}\), and all \(x\) such that \(f_X(x) > 0\).

Definition
Let \(X\) and \(Y\) be jointly absolutely continuous random variables with joint PDF \(f_{X,Y}\). The conditional distribution of \(Y\) given \(X = x\) assigns to each set \(A \subset {\mathbb{R}}\) the probability \[\mathbb{P}(Y \in A \ \vert\ X = x) = \int_A f_{Y|X}(y\ \vert\ x)\, \mathsf{d}y,\] valid for all \(x\) such that \(f_X(x) > 0\).

More dice examples

  • Let \(X\) and \(Y\) be the results of rolling two fair 6-sided dice.
  • Let \(V = X + Y\) and \(W = \max\{X, Y\}\).

The joint PMF of \(W\) and \(V\) is

| \(W\ \backslash\ V\) | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12   |
|----------------------|------|------|------|------|------|------|------|------|------|------|------|
| 1                    | 1/36 | 0    | 0    | 0    | 0    | 0    | 0    | 0    | 0    | 0    | 0    |
| 2                    | 0    | 2/36 | 1/36 | 0    | 0    | 0    | 0    | 0    | 0    | 0    | 0    |
| 3                    | 0    | 0    | 2/36 | 2/36 | 1/36 | 0    | 0    | 0    | 0    | 0    | 0    |
| 4                    | 0    | 0    | 0    | 2/36 | 2/36 | 2/36 | 1/36 | 0    | 0    | 0    | 0    |
| 5                    | 0    | 0    | 0    | 0    | 2/36 | 2/36 | 2/36 | 2/36 | 1/36 | 0    | 0    |
| 6                    | 0    | 0    | 0    | 0    | 0    | 2/36 | 2/36 | 2/36 | 2/36 | 2/36 | 1/36 |

To condition on \(W\), we look at a particular row, while to condition on \(V\), we look at a particular column. Then renormalize by the sum of the row/column.
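The fix-a-row-then-renormalize step can be sketched as follows (a check using Python's `fractions`; the joint PMF is rebuilt from the 36 equally likely outcomes rather than typed in from the table):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Rebuild the joint PMF of (W, V) from the 36 equally likely dice outcomes.
joint = Counter()
for x, y in product(range(1, 7), repeat=2):
    joint[(max(x, y), x + y)] += Fraction(1, 36)

# Condition on W = 3: fix that row, then renormalize by the row sum.
row = {v: p for (w, v), p in joint.items() if w == 3}
total = sum(row.values())                 # marginal P(W = 3) = 5/36
cond = {v: p / total for v, p in row.items()}
print(cond)  # V | W = 3 puts mass 2/5, 2/5, 1/5 on 4, 5, 6
```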

Heuristics: types of distributions of random variables

  • Suppose that \(X\) and \(Y\) have a joint distribution \(\mathbb{P}(X \in A, Y \in B)\).
  • We can find the joint CDF, PMF, or PDF of \(X\) and \(Y\).
  • Both marginal and conditional distributions of \(X\) and \(Y\) “focus down” to a single variable:
  • The marginal distribution of \(X\) is the distribution of \(X\) when we “ignore” \(Y\).
  • The conditional distribution of \(X\) given \(Y = y\) is the distribution of \(X\) when we “fix” \(Y\) to be \(y\).
  • The marginal comes from “summing out” or “integrating out” the other variable along the rows or columns of the table.
  • The conditional comes from “dividing out” the other variable. Fix a row or a column, and then renormalize (divide out).
  • When we say \(Y\ \vert\ X=x\), we usually specify a formula. Meaning, we write down a function that gives the conditional PMF or PDF of \(Y\) given \(X = x\) for all \(x\) such that \(p_X(x) > 0\) or \(f_X(x) > 0\). Renormalizing the row of a table gives the conditional PMF for a specific value of \(x\).

Conditional PDF

Let \(X\) and \(Y\) be jointly continuous random variables with joint PDF \[f_{X,Y}(x,y) = \frac{1}{x} e^{-x} I_{\{0 \le y \le x\}}(x,y).\]

Exercise 3
  1. Find the marginal PDF of \(X\). What distribution does \(X\) have?
  2. Find the conditional PDF of \(Y\) given \(X = x\). What distribution does \(Y \ \vert\ X\) have?

Independent random variables

Theorem
\(X\) and \(Y\) are independent if and only if for any sets \(A\) and \(B\) we have \[\mathbb{P}( X \in A,\ Y \in B ) = \mathbb{P}( X \in A )\ \mathbb{P}( Y \in B ).\]

This is the definition of independence. It says that the joint distribution factors into the product of the marginals.

But this has immediate consequences for the joint CDF, PMF, and PDF.

Choosing \(A = (-\infty, x]\) and \(B = (-\infty, y]\), we have that, if \(X\) and \(Y\) are independent, then \[\begin{aligned} F_{X,Y}(x,y) &= \mathbb{P}(X \le x, Y \le y) = \mathbb{P}(X \le x) \mathbb{P}(Y \le y) = F_X(x) F_Y(y). \end{aligned}\]

The converse is also true: if \(F_{X,Y}(x,y) = F_X(x) F_Y(y)\) for all \(x, y\), then \(X\) and \(Y\) are independent.
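A quick empirical illustration of the CDF factorization, assuming NumPy (the standard normal marginals and the points \(a, b\) are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200_000)
y = rng.normal(size=200_000)   # drawn independently of x

# P(X <= a, Y <= b) should match P(X <= a) P(Y <= b) up to sampling noise.
a, b = 0.5, -0.3
joint = ((x <= a) & (y <= b)).mean()
prod = (x <= a).mean() * (y <= b).mean()
print(joint, prod)
```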

Independent random variables, using PMFs and PDFs

Theorem
If \(X\) and \(Y\) are discrete random variables, then \(X\) and \(Y\) are independent if and only if \[p_{X,Y}(x,y) = p_X(x) p_Y(y),\] for all \(x, y \in {\mathbb{R}}\).

Theorem
If \(X\) and \(Y\) are jointly continuous random variables, then \(X\) and \(Y\) are independent if and only if their joint density can be chosen such that \[f_{X,Y}(x,y) = f_X(x) f_Y(y),\] for all \(x, y \in {\mathbb{R}}\).

  • When you hear “independent”, think “joint factors into the product of the marginals”.

Independence

Let \(X\) and \(Y\) have joint pdf \[f_{X,Y}(x, y) = \begin{cases}8xy & \text{if } 0 \le x < y < 1 \\ 0 & \text{else.} \end{cases}\]

Exercise 4
Find the marginal PDFs of \(X\) and \(Y\). Are \(X\) and \(Y\) independent?

Multivariate transformations

Note

The teaching team forgot that Linear Algebra is not a prerequisite for this course, so we won’t cover general multivariate transformations.

And we’ll forget the ugly multivariate Gaussian distribution. No matrices or determinants for us.

We will only cover scalar-valued functions of 2 random variables, and we will only examine special cases under independence.

  • We will cover the distribution method for maximum/minimum.
  • We will cover sums of independent random variables via convolution (and later via MGFs).

The maximum of two independent random variables

Let \(X\) and \(Y\) be independent random variables with CDFs \(F_X\) and \(F_Y\).

Let \(W = \max\{X, Y\}\).

  • Find the CDF of \(W\).

\[\begin{aligned} F_W(w) &= \mathbb{P}(W \le w) = \mathbb{P}(\max\{X, Y\} \le w)\\ & = \mathbb{P}(X \le w, Y \le w) \\ & = \mathbb{P}(X \le w)\mathbb{P}(Y \le w) \quad\text{(by independence)} \\ & = F_X(w) F_Y(w). \end{aligned}\]

This extends to the maximum of \(n\) independent random variables.

Suppose that \(X_1, X_2, \dots, X_n\) are independent random variables with common CDF \(F_X\). Let \(W = \max\{X_1, X_2, \dots, X_n\}\). Then \[\begin{aligned} F_W(w) &= \mathbb{P}(W \le w) = \mathbb{P}(X_1 \le w, X_2 \le w, \dots, X_n \le w) = (F_X(w))^n,\\ \Longrightarrow f_W(w) &= \frac{\mathsf{d}}{\mathsf{d}w} F_W(w) = n (F_X(w))^{n-1} f_X(w). \quad\quad\text{(by the chain rule if $X$ is continuous)} \end{aligned}\]
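A simulation check of \(F_W(w) = (F_X(w))^n\), assuming NumPy; \(n = 5\) independent \({\mathrm{Unif}}(0,1)\) variables is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
u = rng.uniform(size=(100_000, n))   # each row: n independent Unif(0,1) draws
w = u.max(axis=1)                    # W = max of each row

# For Unif(0,1), F_X(w) = w on (0,1), so F_W(w) = w**n.
t = 0.8
print((w <= t).mean(), t ** n)
```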

Minimum of two independent random variables

Exercise 5

Let \(X\sim {\mathrm{Exp}}(\lambda)\) and \(Y\sim{\mathrm{Exp}}(\mu)\) be independent random variables. Find the distribution of \(U = \min\{X, Y\}\).

Hints:

  • If \(Z\sim{\mathrm{Exp}}(\theta)\), then \(F_Z(z) = \mathbb{P}(Z \le z) = 1 - e^{-\theta z}\) for \(z > 0\).
  • Therefore, \(\mathbb{P}(Z > z) = e^{-\theta z}\) for \(z > 0\).

Sums of independent random variables

To find the distribution of a sum of independent random variables, we could use the distribution method or the Jacobian method.

For this specific case, there’s also a third method called convolution.

Theorem
Let \(X\) and \(Y\) be independent random variables.

If \(X\) and \(Y\) are both discrete random variables, then the PMF of \(U = X + Y\) is given by \[p_U(u) = \sum_{w} p_X(u-w) p_Y(w) = \sum_w p_Y(u - w) p_X(w).\]

If \(X\) and \(Y\) are both continuous random variables, then the PDF of \(U = X + Y\) is given by \[f_U(u) = \int_{-\infty}^\infty f_X(u - w) f_Y(w) \mathsf{d}w = \int_{-\infty}^\infty f_Y(u - w) f_X(w) \mathsf{d}w.\]
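In the discrete case, the convolution sum is exactly what `np.convolve` computes. As a sketch (assuming NumPy), here is the PMF of the sum of two fair dice:

```python
import numpy as np

die = np.full(6, 1 / 6)           # PMF of one fair die on support 1..6

# p_U(u) = sum_w p_X(u - w) p_Y(w) is a discrete convolution.
total = np.convolve(die, die)     # PMF of X + Y on support 2..12
print(dict(zip(range(2, 13), np.round(total, 4))))
```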

Sums of independent uniform random variables

Let \(X\) and \(Y\) be independent \({\mathrm{Unif}}(0, 1)\) random variables. Let \(S = X + Y\). Find the distribution of \(S\).

\[\begin{aligned} f_S(s) &= \int_{-\infty}^\infty f_X(w) f_Y(s - w) \mathsf{d}w = \int_{-\infty}^\infty I_{(0,1)}(w) I_{(0,1)}(s - w) \mathsf{d}w \\ &= \int_{-\infty}^\infty I_{(0,1)}(w) I_{(s-1,s)}(w) \mathsf{d}w \\ &= \begin{cases} \int_0^s \mathsf{d}w = s & 0 < s < 1\\ \int_{s-1}^1 \mathsf{d}w = 2 - s & 1 \le s < 2\\ 0 & \text{otherwise.} \end{cases}\\ &= s I_{(0,1)}(s) + (2 - s) I_{[1,2)}(s). \end{aligned}\]

This is sometimes called the triangular distribution on \((0,2)\).
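A simulation check of the triangular result, assuming NumPy; the CDF implied by the density above is \(s^2/2\) on \((0,1]\) and \(1-(2-s)^2/2\) on \((1,2)\):

```python
import numpy as np

rng = np.random.default_rng(4)
s = rng.uniform(size=200_000) + rng.uniform(size=200_000)   # S = X + Y

# Compare the empirical CDF with the triangular CDF at a few points.
for t in (0.5, 1.0, 1.5):
    exact = t ** 2 / 2 if t <= 1 else 1 - (2 - t) ** 2 / 2
    print(t, round(float((s <= t).mean()), 3), exact)
```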

Sum of independent Exponential random variables

Exercise 6
Let \(X\) and \(Y\) be independent \({\mathrm{Exp}}(\lambda)\) random variables. Find the PDF of \(U = X + Y\) using the convolution method.

Hint: be careful with the limits of integration. Recall that if \(Z\sim {\mathrm{Exp}}(\theta)\), then \(f_Z(z) = \theta e^{-\theta z}I_{(0,\infty)}(z)\).
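To check your closed-form answer without giving it away, you can evaluate the convolution integral numerically at a point and compare. This is a sketch assuming NumPy; `conv_at` is a hypothetical helper and the rate \(\lambda = 1.5\) is arbitrary:

```python
import numpy as np

lam = 1.5
f = lambda z: lam * np.exp(-lam * z) * (z > 0)     # Exp(lam) density

# Riemann-sum approximation of f_U(u) = int f(u - w) f(w) dw.
def conv_at(u, grid=np.linspace(0, 10, 200_001)):
    return float(np.sum(f(u - grid) * f(grid)) * (grid[1] - grid[0]))

print(conv_at(1.0))   # compare with your formula evaluated at u = 1
```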