Conditional distributions, independence, and expected values
Last modified — 16 Mar 2026
We’ve seen three methods so far: adjusting the CDF, the distribution method, and the change-of-variables (Jacobian) method.
Example of adjusting the CDF: \(X \sim {\mathrm{Exp}}(\lambda)\), find the distribution of \(Y = 3X\).
\[\begin{aligned} F_X(x) &= 1 - e^{-\lambda x} \quad\text{for } x > 0\\ \Longrightarrow\quad F_Y(y) &= F_X(y/3) = 1 - e^{-\lambda y/3} \quad\text{for } y > 0. \end{aligned}\]
Now, if we want the PDF of \(Y\), we can differentiate \(F_Y(y)\) to get \[f_Y(y) = \frac{\lambda}{3} e^{-\lambda y/3}I_{[0,\infty)}(y).\]
So, we hopefully recognize that \(Y \sim {\mathrm{Exp}}(\lambda/3)\).
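As a sanity check (this script is not from the notes, and \(\lambda = 2\) is an arbitrary choice), we can simulate \(Y = 3X\) and compare its empirical CDF to the \({\mathrm{Exp}}(\lambda/3)\) CDF:

```python
import math
import random

random.seed(0)
lam = 2.0  # rate of X; arbitrary choice for illustration
n = 100_000

# Simulate Y = 3X with X ~ Exp(lam).
ys = [3 * random.expovariate(lam) for _ in range(n)]

# Largest gap between the empirical CDF of Y and the Exp(lam/3) CDF
# at a few test points.
max_err = max(
    abs(sum(v <= y for v in ys) / n - (1 - math.exp(-lam * y / 3)))
    for y in (0.5, 1.0, 2.0, 4.0)
)
```

With 100,000 draws the empirical and theoretical CDFs should agree to within about a percentage point.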
This really just uses the definition of the distribution.
But, we can sort of make it algorithmic. If you want to find the distribution of \(Y = g(X)\), then you can do the following: \[\mathbb{P}(Y \in A) = \mathbb{P}(g(X) \in A) = \mathbb{P}(X \in \{x : g(x) \in A\}) = \cdots\]
Example of the distribution method: \(X \sim {\mathrm{Exp}}(\lambda)\), find the distribution of \(Y = 3X\).
By properties of CDF (uniquely determines the distribution), I can choose \(A = (-\infty, y]\). \[ \mathbb{P}(Y \in A) = \mathbb{P}(Y \le y) = \mathbb{P}(3X \le y) = \mathbb{P}(X \le y/3) = F_X(y/3) = 1 - e^{-\lambda y/3} \quad\text{for } y > 0. \]
This looks very similar to before. The “algebra games” were easy this time.
The change-of-variables (Jacobian) method only works for differentiable, strictly monotonic transformations of continuous RVs.
But it is very algorithmic. If \(g\) is such a transformation and you want to find the distribution of \(Y = g(X)\), then \[f_Y(y) = f_X\big(g^{-1}(y)\big)\left|\frac{\mathsf{d}}{\mathsf{d}y} g^{-1}(y)\right|.\]
Let \(Y = F_X(X)\) where \(F_X\) is the CDF of \(X\). Assume that \(X\) is absolutely continuous with strictly positive density. Then \(Y \sim {\mathrm{Unif}}(0, 1)\); this is the probability integral transform.
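A quick empirical check (not part of the notes; \(\lambda = 1.5\) is an arbitrary choice) that \(F_X(X)\) is uniform on \((0,1)\) when \(X \sim {\mathrm{Exp}}(\lambda)\):

```python
import math
import random

random.seed(1)
lam = 1.5  # rate of X; arbitrary choice
n = 100_000

# Apply the Exp(lam) CDF, F(x) = 1 - exp(-lam * x), to Exp(lam) draws.
us = [1 - math.exp(-lam * random.expovariate(lam)) for _ in range(n)]

# For Unif(0,1), the CDF at u is just u; compare empirically.
max_err = max(abs(sum(v <= u for v in us) / n - u) for u in (0.1, 0.25, 0.5, 0.9))
```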
The joint PMF of \(W\) (the larger of two fair dice rolls) and \(V\) (their sum) is
| \(W\ \backslash\ V\) | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1/36 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 2/36 | 1/36 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 2/36 | 2/36 | 1/36 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 1/36 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 2/36 | 1/36 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 2/36 | 2/36 | 2/36 | 2/36 | 2/36 | 1/36 |
To condition on \(W\), we look at a particular row, while to condition on \(V\), we look at a particular column. Then renormalize by the sum of the row/column.
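The table can be rebuilt by enumerating all 36 outcomes of two fair dice (a short script, not in the notes), and conditioning on a row is just renormalizing it, e.g. for \(W = 3\):

```python
from collections import defaultdict
from fractions import Fraction

# Joint PMF of W = max and V = sum for two fair dice.
pmf = defaultdict(Fraction)
for a in range(1, 7):
    for b in range(1, 7):
        pmf[(max(a, b), a + b)] += Fraction(1, 36)

# Condition on W = 3: pick out row 3, then renormalize by the row sum.
row = {v: p for (w, v), p in pmf.items() if w == 3}
total = sum(row.values())               # P(W = 3) = 5/36
cond = {v: p / total for v, p in row.items()}
# cond is {4: 2/5, 5: 2/5, 6: 1/5}
```

Using `Fraction` keeps all the \(k/36\) entries exact, so the script reproduces the table cell-for-cell.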
Let \(X\) and \(Y\) be jointly continuous random variables with joint PDF \[f_{X,Y}(x,y) = \frac{1}{x} e^{-x} I_{\{0 \le y \le x\}}(x,y).\]
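A natural computation for this density (a standard one, worked here for completeness): the marginal of \(X\) is \[f_X(x) = \int_0^x \frac{1}{x} e^{-x} \,\mathsf{d}y = e^{-x} I_{(0,\infty)}(x),\] so \(X \sim {\mathrm{Exp}}(1)\), and the conditional density of \(Y\) given \(X = x\) is \[f_{Y\mid X}(y \mid x) = \frac{f_{X,Y}(x,y)}{f_X(x)} = \frac{1}{x} I_{[0,x]}(y),\] i.e. \(Y \mid X = x \sim {\mathrm{Unif}}(0, x)\).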
This is the definition of independence. It says that the joint distribution factors into the product of the marginals.
But this has immediate consequences for the joint CDF, PMF, and PDF.
Choosing \(A = (-\infty, x]\) and \(B = (-\infty, y]\), we have that, if \(X\) and \(Y\) are independent, then \[\begin{aligned} F_{X,Y}(x,y) &= \mathbb{P}(X \le x, Y \le y) = \mathbb{P}(X \le x) \mathbb{P}(Y \le y) = F_X(x) F_Y(y). \end{aligned}\]
The converse is also true: if \(F_{X,Y}(x,y) = F_X(x) F_Y(y)\) for all \(x, y\), then \(X\) and \(Y\) are independent.
Let \(X\) and \(Y\) have joint pdf \[f_{X,Y}(x, y) = \begin{cases}8xy & \text{if } 0 \le x < y < 1 \\ 0 & \text{else.} \end{cases}\]
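As a quick check (a standard computation, added for completeness), the marginals are \[f_X(x) = \int_x^1 8xy \,\mathsf{d}y = 4x(1 - x^2)\, I_{(0,1)}(x), \qquad f_Y(y) = \int_0^y 8xy \,\mathsf{d}x = 4y^3\, I_{(0,1)}(y),\] so \(f_X(x) f_Y(y) \ne f_{X,Y}(x,y)\) on the support and \(X\) and \(Y\) are not independent, even though \(8xy\) factors as a function of \(x\) times a function of \(y\). The culprit is the triangular support \(\{0 \le x < y < 1\}\), which is not a product set.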
Note
The teaching team forgot that Linear Algebra is not a prerequisite for this course, so we won’t cover general multivariate transformations.
And we’ll forget the ugly multivariate Gaussian distribution. No matrices or determinants for us.
We will only cover scalar-valued functions of 2 random variables, and we will only examine special cases under independence.
Let \(X\) and \(Y\) be independent random variables with CDFs \(F_X\) and \(F_Y\).
Let \(W = \max\{X, Y\}\).
\[\begin{aligned} F_W(w) &= \mathbb{P}(W \le w) = \mathbb{P}(\max\{X, Y\} \le w)\\ & = \mathbb{P}(X \le w, Y \le w) \\ & = \mathbb{P}(X \le w)\mathbb{P}(Y \le w) \\ & = F_X(w) F_Y(w). \end{aligned}\]
This extends to the maximum of \(n\) independent random variables.
Suppose that \(X_1, X_2, \dots, X_n\) are independent random variables with common CDF \(F_X\). Let \(W = \max\{X_1, X_2, \dots, X_n\}\). Then \[\begin{aligned} F_W(w) &= \mathbb{P}(W \le w) = \mathbb{P}(X_1 \le w, X_2 \le w, \dots, X_n \le w) = (F_X(w))^n,\\ \Longrightarrow f_W(w) &= \frac{\mathsf{d}}{\mathsf{d}w} F_W(w) = n (F_X(w))^{n-1} f_X(w). \quad\quad\text{(by the chain rule if $X$ is continuous)} \end{aligned}\]
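A simulation sanity check (not in the notes; \(n = 5\) i.i.d. \({\mathrm{Unif}}(0,1)\) variables as an arbitrary example) that the CDF of the maximum is \((F_X(w))^n = w^n\):

```python
import random

random.seed(2)
n_rep, n = 100_000, 5

# Maximum of n i.i.d. Unif(0,1) draws, repeated n_rep times.
ws = [max(random.random() for _ in range(n)) for _ in range(n_rep)]

# For Unif(0,1), F_X(w) = w, so F_W(w) should equal w**n.
max_err = max(
    abs(sum(v <= w for v in ws) / n_rep - w**n) for w in (0.5, 0.8, 0.9)
)
```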
Let \(X\sim {\mathrm{Exp}}(\lambda)\) and \(Y\sim{\mathrm{Exp}}(\mu)\) be independent random variables. Find the distribution of \(U = \min\{X, Y\}\).
Hint: work with the survival function, \(\mathbb{P}(U > u) = \mathbb{P}(X > u, Y > u) = \mathbb{P}(X > u)\,\mathbb{P}(Y > u)\), rather than the CDF directly.
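One way to check an answer empirically (script not part of the notes; the rates \(\lambda = 1\), \(\mu = 2\) are arbitrary) is to compare the empirical survival function of \(U\) with the product \(\mathbb{P}(X > u)\,\mathbb{P}(Y > u)\), which independence guarantees:

```python
import math
import random

random.seed(3)
lam, mu = 1.0, 2.0  # rates of X and Y; arbitrary choices
n = 100_000

# Simulate U = min(X, Y) with X ~ Exp(lam), Y ~ Exp(mu) independent.
us = [min(random.expovariate(lam), random.expovariate(mu)) for _ in range(n)]

# By independence, P(U > u) = P(X > u) P(Y > u) = exp(-lam*u) * exp(-mu*u).
max_err = max(
    abs(sum(v > u for v in us) / n - math.exp(-lam * u) * math.exp(-mu * u))
    for u in (0.2, 0.5, 1.0)
)
```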
To find the distribution of a sum of independent random variables, we could use the distribution method or the Jacobian method.
For this specific case, there’s also a third method called convolution.
If \(X\) and \(Y\) are both discrete random variables, then the PMF of \(U = X + Y\) is given by \[p_U(u) = \sum_{w} p_X(u-w) p_Y(w) = \sum_w p_Y(u - w) p_X(w).\]
If \(X\) and \(Y\) are both continuous random variables, then the PDF of \(U = X + Y\) is given by \[f_U(u) = \int_{-\infty}^\infty f_X(u - w) f_Y(w) \mathsf{d}w = \int_{-\infty}^\infty f_Y(u - w) f_X(w) \mathsf{d}w.\]
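As a sketch of the discrete formula (example not in the notes), convolving the PMF of a fair six-sided die with itself recovers the familiar PMF of the sum of two dice:

```python
from fractions import Fraction

# PMF of one fair six-sided die.
p = {k: Fraction(1, 6) for k in range(1, 7)}

def convolve(p_x, p_y):
    """Discrete convolution: p_U(u) = sum over w of p_X(u - w) p_Y(w)."""
    out = {}
    for w, pw in p_y.items():
        for x, px in p_x.items():
            out[x + w] = out.get(x + w, 0) + px * pw
    return out

p_sum = convolve(p, p)  # PMF of the sum of two dice
# e.g. p_sum[7] == 1/6, p_sum[2] == 1/36
```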
Let \(X\) and \(Y\) be independent \({\mathrm{Unif}}(0, 1)\) random variables. Let \(S = X + Y\). Find the distribution of \(S\).
\[\begin{aligned} f_S(s) &= \int_{-\infty}^\infty f_X(w) f_Y(s - w) \mathsf{d}w = \int_{-\infty}^\infty I_{(0,1)}(w) I_{(0,1)}(s - w) \mathsf{d}w \\ &= \int_{-\infty}^\infty I_{(0,1)}(w) I_{(s-1,s)}(w) \mathsf{d}w \\ &= \begin{cases} \int_0^s \mathsf{d}w = s & 0 < s < 1\\ \int_{s-1}^1 \mathsf{d}w = 2 - s & 1 \le s < 2\\ 0 & \text{otherwise.} \end{cases}\\ &= s I_{(0,1)}(s) + (2 - s) I_{[1,2)}(s). \end{aligned}\]
This is sometimes called the triangular distribution on \((0,2)\).
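A simulation check of this result (script not from the notes): sum two independent \({\mathrm{Unif}}(0,1)\) draws and compare the empirical CDF to the triangular CDF, \(F_S(s) = s^2/2\) for \(0 < s < 1\) and \(1 - (2-s)^2/2\) for \(1 \le s < 2\), obtained by integrating the density above.

```python
import random

random.seed(4)
n = 100_000

# S = X + Y with X, Y independent Unif(0,1).
ss = [random.random() + random.random() for _ in range(n)]

def triangular_cdf(s):
    """CDF of the triangular distribution on (0, 2)."""
    return s * s / 2 if s < 1 else 1 - (2 - s) ** 2 / 2

max_err = max(
    abs(sum(v <= s for v in ss) / n - triangular_cdf(s)) for s in (0.5, 1.0, 1.5)
)
```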
Hint: be careful with the limits of integration. Recall that if \(Z\sim {\mathrm{Exp}}(\theta)\), then \(f_Z(z) = \theta e^{-\theta z}I_{(0,\infty)}(z)\).
Stat 302 - Winter 2025/26