Covariance, Correlation, and MGFs
Last modified — 21 Jun 2026
By the end of this lecture, students are anticipated to be able to:
If we have two random variables \(X\) and \(Y\), we can measure the relationship between them.
The covariance between two random variables \(X\) and \(Y\) is defined by \[\operatorname{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])].\]
We can compute the covariance using the joint PMF (discrete case) or the joint PDF (continuous case) of \(X\) and \(Y\): \[\begin{aligned} \operatorname{Cov}(X, Y) &= \sum_{x} \sum_{y} (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) p_{X,Y}(x, y) && \text{(discrete)} \\ \operatorname{Cov}(X, Y) &= \int_{-\infty}^\infty \int_{-\infty}^\infty (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) f_{X,Y}(x, y) \, \mathsf{d}x \, \mathsf{d}y && \text{(continuous)} \end{aligned}\]
Let \(X\) be the number of heads in 5 tosses of a fair coin, and let \(Y\) be the number of tails in the same 5 tosses. Find \(\operatorname{Cov}(X, Y)\).
Hint: If \(Z\sim {\mathrm{Binom}}(n, \theta)\), then \(\operatorname{Var}(Z) = n\theta(1-\theta)\).
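The double-sum formula can be checked numerically on the coin example. The sketch below is a minimal illustration, assuming the joint PMF puts mass \(\binom{5}{x}/2^5\) on each point \((x, 5 - x)\), since \(Y = 5 - X\). The helper name `covariance` and the dictionary layout are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from math import comb

def covariance(joint_pmf):
    """Covariance of (X, Y) from a dict {(x, y): probability}."""
    mean_x = sum(p * x for (x, _), p in joint_pmf.items())
    mean_y = sum(p * y for (_, y), p in joint_pmf.items())
    # Double sum over the support of the joint PMF.
    return sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in joint_pmf.items())

n = 5
# Y = n - X, so only the points (x, n - x) carry probability.
coin_pmf = {(x, n - x): Fraction(comb(n, x), 2**n) for x in range(n + 1)}

cov_xy = covariance(coin_pmf)
print(cov_xy)  # -5/4
```

Exact fractions are used so that the result can be compared directly with \(-\operatorname{Var}(X) = -n\theta(1-\theta)\).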
Let \(X\) and \(Y\) be random variables with finite variances.
Covariance is not a standardized measure of the relationship between \(X\) and \(Y\).
For example, if we multiply \(X\) by 100, then \(\operatorname{Cov}(X, Y)\) will also be multiplied by 100.
The correlation between two random variables \(X\) and \(Y\) is defined by \[\rho_{XY} = \operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}.\]
Returning to the heads or tails example, we have that \[\begin{aligned} \rho_{XY} &= \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-5/4}{\sqrt{5/4} \sqrt{5/4}} = -1. \end{aligned}\]
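To see why correlation is the standardized measure, we can rescale \(X\) and compare. The following is a rough sketch, assuming the same coin-toss joint PMF as in the example; the function name `cov_and_corr` is a hypothetical helper introduced only for this illustration.

```python
from math import comb, isclose, sqrt

def cov_and_corr(joint_pmf):
    """Return (Cov(X, Y), Corr(X, Y)) from a dict {(x, y): probability}."""
    ex = sum(p * x for (x, _), p in joint_pmf.items())
    ey = sum(p * y for (_, y), p in joint_pmf.items())
    var_x = sum(p * (x - ex) ** 2 for (x, _), p in joint_pmf.items())
    var_y = sum(p * (y - ey) ** 2 for (_, y), p in joint_pmf.items())
    cov = sum(p * (x - ex) * (y - ey) for (x, y), p in joint_pmf.items())
    return cov, cov / sqrt(var_x * var_y)

n = 5
pmf = {(x, n - x): comb(n, x) / 2**n for x in range(n + 1)}
# Same probabilities, but X is replaced by 100 * X.
scaled_pmf = {(100 * x, y): p for (x, y), p in pmf.items()}

cov, corr = cov_and_corr(pmf)
cov_scaled, corr_scaled = cov_and_corr(scaled_pmf)
print(cov, corr)
print(cov_scaled, corr_scaled)
```

The covariance should grow by a factor of 100 while the correlation stays at \(-1\).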
Let \(X\) and \(Y\) be discrete random variables with the following joint distribution:
| \(p_{X,Y}(x, y)\) | \(Y = 2\) | \(Y = 4\) |
|---|---|---|
| \(X = 1\) | 0.2 | 0.3 |
| \(X = 3\) | 0.1 | 0.4 |
Calculate the correlation between \(X\) and \(Y\).
Hint: \(\sigma_X = 1\) and \(\sigma_Y \approx 0.92\).
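One way to check a hand calculation for this exercise is to evaluate the expectations directly from the table. This is only a sketch: the dictionary `joint` transcribes the four table entries, and `expect` is a hypothetical helper, not notation from the course.

```python
from math import sqrt

# Joint PMF from the table: keys are (x, y), values are probabilities.
joint = {(1, 2): 0.2, (1, 4): 0.3, (3, 2): 0.1, (3, 4): 0.4}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
sigma_x = sqrt(expect(lambda x, y: (x - ex) ** 2))
sigma_y = sqrt(expect(lambda x, y: (y - ey) ** 2))
cov_xy = expect(lambda x, y: (x - ex) * (y - ey))
rho = cov_xy / (sigma_x * sigma_y)

print(round(sigma_x, 2), round(sigma_y, 2), round(cov_xy, 2), round(rho, 3))
```

The standard deviations should agree with the hint, and the correlation comes out small and positive.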
Important
We will not cover probability generating functions or characteristic functions in this course.
The moment generating function (MGF) of a random variable \(X\) is defined by \[m_X(t) = \mathbb{E}[e^{tX}].\]
Last lecture, we saw that \(\mathbb{E}[\exp(tX)] = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}\) when \(X \sim {\mathrm{Gam}}(\alpha, \lambda)\) (see “Exercise: More Gamma Expectations”). This is in fact the moment generating function of \(X\):
\[\begin{aligned} m_X(t) &= \mathbb{E}[e^{tX}] = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}, \quad t < \lambda. \end{aligned}\]
Let \(X\) be a random variable with MGF \(m_X(t)\), and suppose there exists \(s>0\) such that \(m_X(t)<\infty\) for all \(t \in (-s, s)\).
Then for any integer \(k \ge 1\), \[\mathbb{E}[X^k] = m_X^{(k)}(0) = \left.\frac{\mathsf{d}^k}{\mathsf{d}t^k} m_X(t)\right|_{t=0}.\]
Let \(X \sim {\mathrm{Gam}}(\alpha, \lambda)\) with MGF \(m_X(t) = \left(1 -\frac{t}{\lambda}\right)^{-\alpha}\). Find \(\operatorname{Var}(X)\).
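A hand derivation of the variance can be sanity-checked numerically by approximating the derivatives of the MGF at \(t = 0\). The sketch below uses central finite differences rather than symbolic differentiation. The parameter values \(\alpha = 3\), \(\lambda = 2\) and the step size `h` are arbitrary illustrative choices, and `first_two_moments` is a hypothetical helper.

```python
def gamma_mgf(t, alpha, lam):
    """MGF of Gam(alpha, lam); only valid for t < lam."""
    return (1 - t / lam) ** (-alpha)

def first_two_moments(mgf, h=1e-3):
    """Approximate E[X] = m'(0) and E[X^2] = m''(0) by central differences."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return m1, m2

alpha, lam = 3.0, 2.0  # illustrative values, not from the lecture
m1, m2 = first_two_moments(lambda t: gamma_mgf(t, alpha, lam))
variance = m2 - m1**2  # Var(X) = E[X^2] - (E[X])^2

# Compare with the known Gamma variance alpha / lam^2.
print(variance, alpha / lam**2)
```

The two printed numbers should agree to several decimal places; the small gap is finite-difference error, not a property of the MGF.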
Let \(X \sim \mathcal{N}(\mu, \sigma^2)\). Then,
\[m_X(t) = \mathbb{E}[e^{tX}] = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right).\]
Find \(\mathbb{E}[X]\) and \(\operatorname{Var}(X)\) using the MGF of \(X\).
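Since the MGF is just an expectation, the closed form for the normal distribution can also be checked by simulation. The following is a minimal Monte Carlo sketch: the values of `mu`, `sigma`, `t`, the seed, and the number of draws are arbitrary assumptions made for illustration.

```python
import math
import random

def normal_mgf(t, mu, sigma):
    """Closed-form MGF of N(mu, sigma^2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

random.seed(0)  # fixed seed so the run is reproducible
mu, sigma, t = 1.0, 2.0, 0.3
n_draws = 200_000

# Sample average of exp(t X) estimates E[exp(t X)].
estimate = sum(math.exp(t * random.gauss(mu, sigma)) for _ in range(n_draws)) / n_draws
exact = normal_mgf(t, mu, sigma)
print(estimate, exact)
```

The estimate should be close to the closed form, with a discrepancy on the order of the Monte Carlo standard error.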
Let \(X\) and \(Y\) be independent random variables with MGFs \(m_X(t)\) and \(m_Y(t)\), respectively.
Then the MGF of \(X + Y\) is given by \[\begin{aligned} m_{X+Y}(t) &= \mathbb{E}[e^{t(X + Y)}] = \mathbb{E}[e^{tX} e^{tY}] = \mathbb{E}[e^{tX}] \mathbb{E}[e^{tY}] && \text{$X$ and $Y$ are independent} \\ &= m_X(t) m_Y(t). \end{aligned}\]
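The product rule can be verified on a small discrete example. In this sketch, the two marginal PMFs are hypothetical and chosen only for illustration; independence is imposed by building the joint PMF as the product of the marginals.

```python
import math
from itertools import product

def mgf(pmf, t):
    """MGF of a discrete random variable given as {value: probability}."""
    return sum(p * math.exp(t * v) for v, p in pmf.items())

# Hypothetical marginals, not taken from the lecture.
pmf_x = {0: 0.3, 1: 0.7}
pmf_y = {0: 0.5, 1: 0.2, 2: 0.3}

# Under independence, P(X = x, Y = y) = P(X = x) P(Y = y).
pmf_sum = {}
for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
    pmf_sum[x + y] = pmf_sum.get(x + y, 0.0) + px * py

t = 0.4
lhs = mgf(pmf_sum, t)                # m_{X+Y}(t), computed from the PMF of X + Y
rhs = mgf(pmf_x, t) * mgf(pmf_y, t)  # m_X(t) m_Y(t)
print(lhs, rhs)
```

The two values should match up to floating-point rounding for any choice of `t`.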
If \(X\) and \(Y\) are random variables with MGFs \(m_X(t)\) and \(m_Y(t)\), respectively, and there exists \(s > 0\) such that for all \(t \in (-s, s)\), \(m_X(t) = m_Y(t) < \infty\), then \(X\) and \(Y\) have the same distribution.
This is a very important result, as it allows us to identify the distribution of a random variable by finding its MGF.
Let \(X_1, \dots, X_n\) be independent identically distributed (i.i.d.) random variables with \(X_i \sim \mathcal{N}(\mu, \sigma^2)\) for all \(i\).
Note that \(m_{X_i}(t) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right)\) for \(i = 1, \ldots, n\).
Find the distribution of \(\overline{X} = \frac{1}{n}\sum_{i=1}^n X_i\) using MGFs.
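A candidate answer to this exercise can be checked numerically. By independence, \(m_{\overline{X}}(t) = \left[m_{X_1}(t/n)\right]^n\), and the sketch below compares this with the MGF of \(\mathcal{N}(\mu, \sigma^2/n)\) at a few points. The parameter values and the grid of \(t\) values are arbitrary assumptions for illustration.

```python
import math

def normal_mgf(t, mu, var):
    """MGF of N(mu, var)."""
    return math.exp(mu * t + 0.5 * var * t**2)

mu, var, n = 1.5, 4.0, 10  # illustrative values, not from the lecture

checks = []
for t in (-1.0, 0.5, 2.0):
    via_product = normal_mgf(t / n, mu, var) ** n  # [m_X(t / n)]^n
    candidate = normal_mgf(t, mu, var / n)         # MGF of N(mu, var / n)
    checks.append((via_product, candidate))
    print(t, via_product, candidate)
```

Agreement at a handful of points is not a proof; the uniqueness theorem requires equality on a whole interval around zero, which follows from the algebra.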
The previous theorems assume independence. Consider the scenario where \(X\) and \(Y\) are not independent.
Let \(X\) and \(Y\) be random variables that are not necessarily independent.
Then, for a function \(h\), the MGF of \(h(X, Y)\) is given by \[ \begin{aligned} m_{h(X,Y)}(t) &= \mathbb{E}[e^{t\,h(X,Y)}]. \end{aligned} \]
For example, if we were interested in the MGF of \(X + Y\), we would compute \(m_{X+Y}(t) = \mathbb{E}[e^{t(X+Y)}]\) directly from the joint distribution.
Let \(X\) and \(Y\) have joint pmf:
| \(p_{X,Y}(x, y)\) | \(Y = 0\) | \(Y = 1\) |
|---|---|---|
| \(X = 0\) | 0.1 | 0.4 |
| \(X = 1\) | 0.4 | 0.1 |
Find the MGF of \(Z = X + Y\).
Find \(\mathbb{E}[Z]\) using the MGF.
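The direct approach can be sketched in a few lines: sum \(e^{t(x+y)}\) against the joint PMF, then differentiate at \(t = 0\). The code below is an illustrative check, assuming the joint PMF in the table; the derivative is approximated by a central difference with an arbitrary step `h`, and `mgf_marginal` is a hypothetical helper.

```python
import math

# Joint PMF from the table: keys are (x, y).
joint = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.4, (1, 1): 0.1}

def mgf_z(t):
    """m_Z(t) = E[exp(t (X + Y))], summed directly over the joint PMF."""
    return sum(p * math.exp(t * (x + y)) for (x, y), p in joint.items())

def mgf_marginal(t, axis):
    """MGF of X (axis=0) or Y (axis=1), using the joint PMF."""
    return sum(p * math.exp(t * xy[axis]) for xy, p in joint.items())

h = 1e-4
mean_z = (mgf_z(h) - mgf_z(-h)) / (2 * h)  # central difference for m_Z'(0)

t = 1.0
product_of_marginals = mgf_marginal(t, 0) * mgf_marginal(t, 1)
print(mean_z, mgf_z(t), product_of_marginals)
```

Here \(m_Z(1)\) and \(m_X(1)\,m_Y(1)\) differ noticeably, which illustrates that the product formula cannot be used when \(X\) and \(Y\) are dependent.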
Stat 302 - Winter 2025/26