Lecture 13

Covariance, Correlation, and MGFs


Grace Tompkins

Last modified — 21 Jun 2026

Learning Outcomes

By the end of this lecture, students should be able to:

  • Define covariance and correlation
  • Calculate covariances and variances from discrete and continuous distributions
  • Define and calculate a moment generating function
  • Use the moment generating function to calculate moments

1 Covariance and Correlation

Covariance

If we have two random variables \(X\) and \(Y\), we can measure the relationship between them.

The covariance between two random variables \(X\) and \(Y\) is defined by \[\operatorname{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])].\]

  • \(\operatorname{Cov}(X, Y)\) is a scalar-valued function of \(X\) and \(Y\).
  • The covariance measures the linear relationship between \(X\) and \(Y\).
  • If \(\operatorname{Cov}(X, Y) > 0\), then \(X\) and \(Y\) tend to increase together; if \(\operatorname{Cov}(X, Y) < 0\), then one tends to decrease as the other increases.

Covariance

We can compute the covariance using the joint PMF (discrete case) or the joint PDF (continuous case) of \(X\) and \(Y\): \[\begin{aligned} \operatorname{Cov}(X, Y) &= \sum_{x} \sum_{y} (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) p_{X,Y}(x, y) && \text{(discrete)}\\ \operatorname{Cov}(X, Y) &= \int_{-\infty}^\infty \int_{-\infty}^\infty (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) f_{X,Y}(x, y) \, \mathsf{d}x \, \mathsf{d}y && \text{(continuous)} \end{aligned}\]
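
As a rough illustration of the discrete formula, the sketch below evaluates the double sum for a small joint PMF. The function name `covariance` and the PMF values are hypothetical, chosen only for this example and not taken from the lecture.

```python
def covariance(joint_pmf):
    """Covariance of (X, Y) from a joint PMF given as {(x, y): probability}."""
    mean_x = sum(x * p for (x, y), p in joint_pmf.items())
    mean_y = sum(y * p for (x, y), p in joint_pmf.items())
    # Definition: E[(X - E[X])(Y - E[Y])], summed over the support.
    return sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint_pmf.items())

# Hypothetical joint PMF (an assumption for illustration): X and Y mostly agree.
example_pmf = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
cov_xy = covariance(example_pmf)
print(cov_xy)
```

Here both means are 0.5, and the result is approximately 0.15, a positive covariance, which matches the intuition that \(X\) and \(Y\) tend to take the same value.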

Properties of covariance

Linearity
For any \(a, b, c \in {\mathbb{R}}\), and random variables \(X\), \(Y\), and \(Z\), \[\begin{aligned} \operatorname{Cov}(a X + b Y, c Z ) &= ac \operatorname{Cov}(X, Z) + bc \operatorname{Cov}(Y, Z). \end{aligned}\]
Easier calculation
\[\operatorname{Cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y].\]
Independence
If \(X\) and \(Y\) are independent, then \(\operatorname{Cov}(X, Y) = 0\). The converse is false.
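
One standard counterexample for the converse can be checked numerically. The sketch below assumes \(X\) is uniform on \(\{-1, 0, 1\}\) and \(Y = X^2\) (this choice is an illustration, not from the lecture), and uses the shortcut formula with exact fractions.

```python
from fractions import Fraction

# Assumed example: X is uniform on {-1, 0, 1} and Y = X**2,
# so Y is completely determined by X.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(x * p for x in support)            # E[X]
e_y = sum(x**2 * p for x in support)         # E[Y] = E[X^2]
e_xy = sum(x * x**2 * p for x in support)    # E[XY] = E[X^3]

cov_xy = e_xy - e_x * e_y                    # shortcut formula

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint_00 = p   # Y = 0 exactly when X = 0
p_x0 = p
p_y0 = p
print(cov_xy, p_joint_00, p_x0 * p_y0)
```

The covariance is exactly zero, yet \(P(X=0, Y=0) = 1/3\) while \(P(X=0)P(Y=0) = 1/9\), so \(X\) and \(Y\) are uncorrelated but not independent.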

Covariance

Let \(X\) be the number of heads in 5 tosses of a fair coin, and let \(Y\) be the number of tails in the same 5 tosses. Find \(\operatorname{Cov}(X, Y)\).

Hint: If \(Z\sim {\mathrm{Binom}}(n, \theta)\), then \(\operatorname{Var}(Z) = n\theta(1-\theta)\).
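
One possible route to the answer, sketched under the observation that every toss is either a head or a tail, so \(Y = 5 - X\) with \(X \sim {\mathrm{Binom}}(5, 1/2)\). It uses the shortcut formula and the hint:

```latex
\[\begin{aligned}
\operatorname{Cov}(X, Y) &= \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] \\
&= \mathbb{E}[X(5 - X)] - \mathbb{E}[X]\,(5 - \mathbb{E}[X]) \\
&= 5\mathbb{E}[X] - \mathbb{E}[X^2] - 5\mathbb{E}[X] + (\mathbb{E}[X])^2 \\
&= -\left(\mathbb{E}[X^2] - (\mathbb{E}[X])^2\right) = -\operatorname{Var}(X) \\
&= -5 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = -\tfrac{5}{4}.
\end{aligned}\]
```

The negative sign makes sense: more heads necessarily means fewer tails.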

Covariance

Variance, Covariance, and Sums

Let \(X\) and \(Y\) be random variables with finite variances.

  • \[\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y).\]
  • If \(X\) and \(Y\) are independent, then \[\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y).\]
  • More generally, if \(X_1, \ldots, X_n\) are independent random variables with finite variances, then \[\operatorname{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \operatorname{Var}(X_i).\]
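
A short derivation of the first identity, written with the shorthand \(\mu_X = \mathbb{E}[X]\) and \(\mu_Y = \mathbb{E}[Y]\) (notation introduced here for convenience):

```latex
\[\begin{aligned}
\operatorname{Var}(X + Y)
&= \mathbb{E}\left[\left((X - \mu_X) + (Y - \mu_Y)\right)^2\right] \\
&= \mathbb{E}[(X - \mu_X)^2] + \mathbb{E}[(Y - \mu_Y)^2] + 2\,\mathbb{E}[(X - \mu_X)(Y - \mu_Y)] \\
&= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y).
\end{aligned}\]
```

The independent case follows because \(\operatorname{Cov}(X, Y) = 0\). As a sanity check with the coin-toss example, \(X + Y = 5\) is constant, and indeed \(\tfrac{5}{4} + \tfrac{5}{4} + 2\left(-\tfrac{5}{4}\right) = 0\).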

Correlation

Covariance is not a standardized measure of the relationship between \(X\) and \(Y\).

For example, if we multiply \(X\) by 100, then \(\operatorname{Cov}(X, Y)\) will also be multiplied by 100.

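The correlation between two random variables \(X\) and \(Y\) is defined by \[\rho_{XY} = \operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y},\] where \(\sigma_X = \sqrt{\operatorname{Var}(X)}\) and \(\sigma_Y = \sqrt{\operatorname{Var}(Y)}\) are the standard deviations of \(X\) and \(Y\).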

  • The correlation is a standardized measure of the linear relationship between \(X\) and \(Y\).
  • We’ll see later that \(-1 \le \rho_{XY} \le 1\).

Returning to the heads or tails example, we have that \[\begin{aligned} \rho_{XY} &= \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-5/4}{\sqrt{5/4} \sqrt{5/4}} = -1. \end{aligned}\]

Correlation

Let \(X\) and \(Y\) be discrete random variables with the following joint distribution:

\[\begin{array}{c|cc} p_{X,Y}(x, y) & Y = 2 & Y = 4 \\ \hline X = 1 & 0.2 & 0.3 \\ X = 3 & 0.1 & 0.4 \end{array}\]

Calculate the correlation between \(X\) and \(Y\).

Hint: \(\sigma_X=1\) and \(\sigma_Y \approx 0.92\).
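
The calculation can be checked numerically. The following is a minimal sketch that encodes the joint distribution above as a dictionary and applies the shortcut formula; the helper name `expect` is hypothetical.

```python
from math import sqrt

# Joint PMF from the exercise: keys are (x, y), values are probabilities.
joint_pmf = {(1, 2): 0.2, (1, 4): 0.3, (3, 2): 0.1, (3, 4): 0.4}

def expect(g):
    """E[g(X, Y)] under the joint PMF above."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())

mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x**2) - mean_x**2
var_y = expect(lambda x, y: y**2) - mean_y**2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y   # E[XY] - E[X]E[Y]
rho = cov_xy / (sqrt(var_x) * sqrt(var_y))
print(round(cov_xy, 4), round(rho, 4))
```

The covariance comes out to about 0.2 and the correlation to about 0.22, a weak positive linear relationship.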

2 Moment generating functions

The Moment Generating Function

Important

We will not cover probability generating functions or characteristic functions in this course. In particular, we will not discuss:

  • (Beginning of 3.1) \(r_X(t) = \mathbb{E}[t^X]\), the probability generating function of a random variable \(X\).
  • (Section 3.1.4) \(c_X(t) = \mathbb{E}[e^{itX}]\), the characteristic function of \(X\).

The Moment Generating Function

The moment generating function (MGF) of a random variable \(X\) is defined by \[m_X(t) = \mathbb{E}[e^{tX}].\]

  • The MGF is a scalar-valued function of \(t\).

The Moment Generating Function

Last lecture, we saw that \(\mathbb{E}[\exp(tX)] = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}\) when \(X \sim {\mathrm{Gam}}(\alpha, \lambda)\) (see “Exercise: More Gamma Expectations”). This was actually the moment generating function!

\[\begin{aligned} m_X(t) &= \mathbb{E}[e^{tX}] = \left(1 - \frac{t}{\lambda}\right)^{-\alpha}, \quad t < \lambda. \end{aligned}\]

  • \(m_X(t)\) depends on the parameters \(\alpha\) and \(\lambda\) of the distribution of \(X\).
  • It also depends on \(t\), which is a free variable that we can choose.
  • This seems like a weird object to care about, but it turns out to be very useful.

Using the MGF to Compute Moments

Suppose \(X\) is a random variable with MGF \(m_X(t)\), and there exists \(s>0\) such that \(m_X(t)<\infty\) for all \(t \in (-s, s)\).

Then for any integer \(k \ge 1\), \[\mathbb{E}[X^k] = m_X^{(k)}(0) = \left.\frac{\mathsf{d}^k}{\mathsf{d}t^k} m_X(t)\right|_{t=0}.\]

  • \(\mathbb{E}[X^k]\) is called the \(k\)-th moment of \(X\).
  • The MGF is called the “moment generating function” because we can use it to compute the moments of \(X\).
  • Specifically, the first moment is \(\mathbb{E}[X] = m_X'(0)\), and the second moment is \(\mathbb{E}[X^2] = m_X''(0)\).
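
To see the derivative relationship numerically, the sketch below approximates \(m_X'(0)\) and \(m_X''(0)\) with central finite differences for a Gamma MGF. The parameter values \(\alpha = 3\), \(\lambda = 2\), the step size, and the function names are all assumptions made for illustration.

```python
def gamma_mgf(t, alpha=3.0, lam=2.0):
    """MGF of Gam(alpha, lam), valid for t < lam. Parameter values are illustrative."""
    return (1.0 - t / lam) ** (-alpha)

def first_two_moments(mgf, h=1e-3):
    """Approximate m'(0) and m''(0) with central finite differences."""
    first = (mgf(h) - mgf(-h)) / (2 * h)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return first, second

m1, m2 = first_two_moments(gamma_mgf)
print(m1, m2)
```

For these parameters the exact moments are \(\mathbb{E}[X] = \alpha/\lambda = 1.5\) and \(\mathbb{E}[X^2] = \alpha(\alpha+1)/\lambda^2 = 3\); the finite-difference values should agree to several decimal places. In practice one would differentiate the MGF by hand, so this is only a sanity check.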

Using the MGF to Compute Moments

Let \(X \sim {\mathrm{Gam}}(\alpha, \lambda)\) with MGF \(m_X(t) = \left(1 -\frac{t}{\lambda}\right)^{-\alpha}\). Find \(\operatorname{Var}(X)\).
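
A sketch of one way to work this out, differentiating the MGF twice and evaluating at \(t = 0\):

```latex
\[\begin{aligned}
m_X'(t) &= \frac{\alpha}{\lambda}\left(1 - \frac{t}{\lambda}\right)^{-\alpha - 1},
& \mathbb{E}[X] &= m_X'(0) = \frac{\alpha}{\lambda}, \\
m_X''(t) &= \frac{\alpha(\alpha + 1)}{\lambda^2}\left(1 - \frac{t}{\lambda}\right)^{-\alpha - 2},
& \mathbb{E}[X^2] &= m_X''(0) = \frac{\alpha(\alpha + 1)}{\lambda^2}, \\
\operatorname{Var}(X) &= \mathbb{E}[X^2] - (\mathbb{E}[X])^2
= \frac{\alpha(\alpha + 1)}{\lambda^2} - \frac{\alpha^2}{\lambda^2}
= \frac{\alpha}{\lambda^2}.
\end{aligned}\]
```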

Using the MGF to Compute Moments

Let \(X \sim \mathcal{N}(\mu, \sigma^2)\). Then,

\[m_X(t) = \mathbb{E}[e^{tX}] = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right).\]

Find \(\mathbb{E}[X]\) and \(\operatorname{Var}(X)\) using the MGF of \(X\).
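
A possible worked solution, using the chain rule and the product rule on the MGF:

```latex
\[\begin{aligned}
m_X'(t) &= (\mu + \sigma^2 t)\, m_X(t),
& \mathbb{E}[X] &= m_X'(0) = \mu, \\
m_X''(t) &= \sigma^2\, m_X(t) + (\mu + \sigma^2 t)^2\, m_X(t),
& \mathbb{E}[X^2] &= m_X''(0) = \sigma^2 + \mu^2, \\
\operatorname{Var}(X) &= \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \sigma^2 + \mu^2 - \mu^2 = \sigma^2.
\end{aligned}\]
```

Both evaluations use \(m_X(0) = 1\), which holds for every MGF.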

Sums of Independent Random Variables

Let \(X\) and \(Y\) be independent random variables with MGFs \(m_X(t)\) and \(m_Y(t)\), respectively.

Then the MGF of \(X + Y\) is given by \[\begin{aligned} m_{X+Y}(t) &= \mathbb{E}[e^{t(X + Y)}] = \mathbb{E}[e^{tX} e^{tY}] = \mathbb{E}[e^{tX}] \mathbb{E}[e^{tY}] && \text{$X$ and $Y$ are independent} \\ &= m_X(t) m_Y(t). \end{aligned}\]

Sums of Independent Random Variables

  • We saw before how to find the PMF/PDF of \(X + Y\) using convolution. The MGF gives us an alternative way to find the distribution of \(X + Y\).
  • If \(X_1, \ldots, X_n\) are independent random variables with common MGF \(m_{X}(t)\), then the MGF of \(S_n = \sum_{i=1}^n X_i\) is given by \[m_{S_n}(t) = (m_X(t))^n.\]
  • If \(X\) has MGF \(m_X(t)\), then \(aX + b\) has MGF \(e^{bt} m_X(at)\) for any \(a, b \in {\mathbb{R}}\).
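
The product rule for i.i.d. sums can be checked on a familiar case. The sketch below assumes the standard fact that a sum of \(n\) independent \(\mathrm{Bernoulli}(\theta)\) variables is \({\mathrm{Binom}}(n, \theta)\), and compares \((m_X(t))^n\) against the Binomial MGF computed directly from its PMF. Function names and the numeric values are illustrative.

```python
from math import comb, exp

def bernoulli_mgf(t, theta):
    """MGF of a Bernoulli(theta) variable: E[e^{tX}] = (1 - theta) + theta * e^t."""
    return (1 - theta) + theta * exp(t)

def binomial_mgf_from_pmf(t, n, theta):
    """MGF of Binom(n, theta), computed directly as a sum over its PMF."""
    return sum(comb(n, k) * theta**k * (1 - theta)**(n - k) * exp(t * k)
               for k in range(n + 1))

n, theta, t = 5, 0.3, 0.7   # illustrative values only
product_form = bernoulli_mgf(t, theta) ** n          # (m_X(t))^n
direct_form = binomial_mgf_from_pmf(t, n, theta)     # E[e^{t S_n}] from the PMF
print(product_form, direct_form)
```

The two numbers agree up to floating-point rounding, which is the identity \(m_{S_n}(t) = (m_X(t))^n\) evaluated at a single point.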

Uniqueness of the MGF

If \(X\) and \(Y\) are random variables with MGFs \(m_X(t)\) and \(m_Y(t)\), respectively, and there exists \(s > 0\) such that for all \(t \in (-s, s)\), \(m_X(t) = m_Y(t) < \infty\), then \(X\) and \(Y\) have the same distribution.

This is a very important result, as it allows us to identify the distribution of a random variable by finding its MGF.

Using These Theorems Together

Let \(X_1, \dots, X_n\) be independent and identically distributed (i.i.d.) random variables with \(X_i \sim \mathcal{N}(\mu, \sigma^2)\) for all \(i\).

Note that \(m_{X_i}(t) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right)\) for \(i = 1, \ldots, n\).

Find the distribution of \(\overline{X} = \frac{1}{n}\sum_{i=1}^n X_i\) using MGFs.
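
One way to put the pieces together: write \(S_n = \sum_{i=1}^n X_i\), so that \(\overline{X} = \frac{1}{n} S_n\), then apply the i.i.d. sum rule followed by the rule for \(aX + b\) with \(a = 1/n\) and \(b = 0\).

```latex
\[\begin{aligned}
m_{S_n}(t) &= \left(m_{X}(t)\right)^n = \exp\left(n\mu t + \frac{1}{2} n \sigma^2 t^2\right), \\
m_{\overline{X}}(t) &= m_{S_n}\!\left(\frac{t}{n}\right)
= \exp\left(n\mu \frac{t}{n} + \frac{1}{2} n \sigma^2 \frac{t^2}{n^2}\right)
= \exp\left(\mu t + \frac{1}{2} \cdot \frac{\sigma^2}{n}\, t^2\right).
\end{aligned}\]
```

This is the MGF of a \(\mathcal{N}(\mu, \sigma^2/n)\) random variable, so by uniqueness of the MGF, \(\overline{X} \sim \mathcal{N}(\mu, \sigma^2/n)\).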

Functions of Non-Independent Random Variables

The previous theorems assume independence. Now let \(X\) and \(Y\) be random variables that are not necessarily independent.

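Then the MGF of \(h(X, Y)\) is given by \[ \begin{aligned} m_{h(X,Y)}(t) &= \mathbb{E}[e^{t\, h(X,Y)}], \end{aligned} \] where the expectation is taken with respect to the joint distribution of \(X\) and \(Y\).

For example, if we were interested in the MGF of \(X + Y\), we would compute \(m_{X+Y}(t) = \mathbb{E}[e^{t(X+Y)}]\) directly from the joint PMF/PDF, since the factorization \(m_X(t) m_Y(t)\) is no longer guaranteed.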

Functions of Non-Independent Random Variables

Let \(X\) and \(Y\) have joint PMF:

\[\begin{array}{c|cc} p_{X,Y}(x, y) & Y = 0 & Y = 1 \\ \hline X = 0 & 0.1 & 0.4 \\ X = 1 & 0.4 & 0.1 \end{array}\]

  1. Find the MGF of \(Z = X + Y\).

  2. Find \(\mathbb{E}[Z]\) using the MGF.
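
A sketch of a solution, summing \(e^{t(x+y)}\) against the joint PMF over the four support points:

```latex
\[\begin{aligned}
m_Z(t) &= \mathbb{E}[e^{t(X+Y)}] = \sum_{x} \sum_{y} e^{t(x + y)}\, p_{X,Y}(x, y) \\
&= 0.1\, e^{0} + 0.4\, e^{t} + 0.4\, e^{t} + 0.1\, e^{2t}
= 0.1 + 0.8\, e^{t} + 0.1\, e^{2t}, \\
m_Z'(t) &= 0.8\, e^{t} + 0.2\, e^{2t}, \\
\mathbb{E}[Z] &= m_Z'(0) = 0.8 + 0.2 = 1.
\end{aligned}\]
```

Note that the marginals are both \(\mathrm{Bernoulli}(0.5)\), so the product \(m_X(t)\, m_Y(t) = (0.5 + 0.5 e^t)^2 = 0.25 + 0.5 e^t + 0.25 e^{2t}\) differs from \(m_Z(t)\). The product formula fails here because \(X\) and \(Y\) are not independent.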

To Do

  • Work on Assignment 3, due Wednesday June 10, 11:59pm on Gradescope.
  • Read Chapter 3.5 and 3.6 before next class.