Moments and moment generating functions
Last modified — 26 Nov 2025
Let \(X\) be a random variable. Its \(k\)-th moment is \[ \mu_k = \mathbb{E}[ X^k ] \qquad k = 1, 2, \ldots \] (when the expectation exists)
For example, if \(X \sim \mathcal{U}(0, 1)\) then \(\mu_1 = \mathbb{E}[ X ] = 1/2\), and \[ \begin{aligned} \mu_2 &= \mathbb{E}\left[ X^{2}\right] =\int_{0}^{1}x^{2}dx=\left. \frac{x^{3}}{3} \right\vert _{0}^{1}=\frac{1}{3} \\ & \cdots \\ \mu_k &= \mathbb{E}\left[ X^k \right]=\int_{0}^{1} x^{k} dx = \left. \frac{x^{k+1}}{k+1} \right\vert _{0}^{1}=\frac{1}{k+1} \end{aligned} \]
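These calculations can be double-checked symbolically; the snippet below is a minimal Python sketch using sympy (not part of the derivation).

```python
# Sanity check (sketch): k-th moment of U(0, 1) computed symbolically with sympy.
import sympy as sp

x, k = sp.symbols("x k", positive=True, integer=True)

# mu_k = E[X^k] = integral of x^k over [0, 1]
mu_k = sp.integrate(x**k, (x, 0, 1))
print(sp.simplify(mu_k))                                    # expect 1/(k + 1)

# First few moments explicitly: 1/2, 1/3, 1/4, 1/5
print([sp.integrate(x**j, (x, 0, 1)) for j in range(1, 5)])
```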
The moment generating function (MGF) of a random variable \(X\) is \[ M_{X}\left( t\right) = \mathbb{E}\left[ \, e^{t \, X} \, \right] \] for the values of \(t \in \mathbb{R}\) for which this expectation exists and is finite
Why would anyone want to study such a thing?
One goal is to have a tool to characterize (identify) distributions (without having to derive their pdf / pmf / cdf)
Ideally, if \(X\) and \(Y\) have the same MGF, then they have the same distribution
Why call it “Moment Generating Function” then?
If \(X \sim {Bin}(n, p)\), then, for any \(t \in \mathbb{R}\), \[ \begin{aligned} M_X(t) &= \mathbb{E}[ e^{t X} ] = \sum_{k=0}^n e^{t \, k} \binom{n}{k} p^k \, (1 - p)^{n-k} \\ & \\ &= \sum_{k=0}^n \binom{n}{k} (e^t \, p)^k \, (1 - p)^{n-k} \\ & \\ &= \left( e^t \, p + 1 - p \right)^n \end{aligned} \] (using the Binomial Theorem)
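A quick numerical sanity check of this formula (the values of \(n\), \(p\) and \(t\) below are arbitrary test values):

```python
# Sketch: compare the closed-form Bin(n, p) MGF with the defining sum E[e^{tX}].
from math import comb, exp

n, p, t = 7, 0.3, 0.5   # arbitrary test values

direct = sum(exp(t * k) * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
formula = (exp(t) * p + 1 - p) ** n

print(direct, formula)  # the two numbers should agree up to rounding
```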
Theorem: if there exists \(\epsilon > 0\) such that \(M_X(t) < \infty\) for all \(t \in [-\epsilon, \epsilon]\), then: (a) all the moments \(\mathbb{E}[X^k]\), \(k = 1, 2, \ldots\), are finite; (b) \(M_X(t)\) has derivatives of all orders at \(t = 0\); and (c) \(\mathbb{E}[X^k] = \left. M_X^{(k)}(t) \right|_{t = 0}\) for \(k = 1, 2, \ldots\)
Item (c) from the Theorem justifies the name “Moment Generating Function”
We have \(\mathbb{E}[ X ] = \left. M_X'(t) \right|_{t = 0}\)
For example, if \(X \sim Bin(n, p)\), then, \(M_X(t) = (e^t p + 1 - p)^n\), thus: \[ M_X'(t) = n \, \left(e^t p + 1 - p \right)^{n-1} \, e^t \, p \] and \[ \mathbb{E}[X] = M_X'(0) = n \, \left(e^0 p + 1 - p \right)^{n-1} \, e^0 \, p = n \, p \]
\[ M_X''(t) = n \, \left( (n - 1) \, \left(e^t p + 1 - p \right)^{n-2} \, e^{2t} \, p^2 + \left(e^t p + 1 - p \right)^{n-1} \, e^t \, p \right) \] and thus \[ \begin{aligned} \mathbb{E}[ X^2 ] &= M_X''(0) = n \, \left( (n - 1) \, \left(e^0 p + 1 - p \right)^{n-2} \, e^{0} \, p^2 + \left(e^0 p + 1 - p \right)^{n-1} \, e^0 \, p \right) \\ & \\ &= n \left( (n-1)p^2 + p \right) \end{aligned} \]
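The derivative computations above can be verified symbolically; the sketch below (using sympy) also confirms the familiar variance \(\mathbb{E}[X^2] - (\mathbb{E}[X])^2 = n\,p\,(1-p)\).

```python
# Sketch: symbolic check of M_X'(0) = np and M_X''(0) = n((n-1)p^2 + p) for X ~ Bin(n, p).
import sympy as sp

t, n, p = sp.symbols("t n p", positive=True)
M = (sp.exp(t) * p + 1 - p) ** n              # MGF of Bin(n, p)

EX  = sp.diff(M, t, 1).subs(t, 0)             # E[X]   = M_X'(0)
EX2 = sp.diff(M, t, 2).subs(t, 0)             # E[X^2] = M_X''(0)

print(sp.simplify(EX - n * p))                          # 0
print(sp.simplify(EX2 - n * ((n - 1) * p**2 + p)))      # 0
print(sp.simplify(EX2 - EX**2 - n * p * (1 - p)))       # 0  (variance of Bin(n, p))
```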
If \(X \sim {\cal U}(a, b)\) then, for any \(t \neq 0\), \[ M_X(t) = \int_a^b e^{t \, u} \left( \frac{1}{b-a} \right) \, du = \left( \frac{1}{t} \right) \, \left( \frac{1}{b-a} \right) \, \left( e^{tb} - e^{ta} \right) \]
Does this MGF exist for all \(t\), including \(t = 0\)? Note that in general we should have \[ M_X(0) = \mathbb{E}[ e^{0 \, X} ] = \mathbb{E}[ 1 ] = 1 \] Does the \(M_X(t)\) above satisfy this?
If \(X \sim {\cal N}(\mu, \sigma^2)\) then \[ M_X(t) = e^{\left( t^2 \, \sigma^2 / 2 + t \, \mu \right)} \] (check the book)
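A Monte Carlo sanity check of this formula (not a proof; \(\mu\), \(\sigma\) and \(t\) are arbitrary test values):

```python
# Sketch: Monte Carlo check of the N(mu, sigma^2) MGF formula.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, t = 1.0, 2.0, 0.3                 # arbitrary test values

x = rng.normal(mu, sigma, size=1_000_000)    # draws from N(mu, sigma^2)
mc = np.exp(t * x).mean()                    # Monte Carlo estimate of E[e^{tX}]
formula = np.exp(t**2 * sigma**2 / 2 + t * mu)

print(mc, formula)                           # should agree up to sampling error
```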
Theorem: Suppose that \(X\) and \(Y\) have MGFs \(M_X\) and \(M_Y\) that are finite for all \(t \in [-\epsilon, \epsilon]\) for some \(\epsilon > 0\). If \[ M_X(t) = M_Y(t) \qquad \text{ for all } \quad t \in [-\epsilon, \epsilon] \] then \(X\) and \(Y\) have the same distribution (i.e. \(F_X(a) = F_Y(a)\) for all \(a \in \mathbb{R}\)).
For example, if \(X \sim Bin(n, p)\) and \(Y \sim Bin(m, p)\) (same \(p\)!), and \(X\) and \(Y\) are independent, what is the distribution of \(X + Y\)?
Thus, if \(X \sim Bin(n, p)\) and \(Y \sim Bin(m, p)\) (same \(p\)!), we know that for any \(t \in \mathbb{R}\): \[ \begin{aligned} M_X(t) &= \left( e^t \, p + 1 - p \right)^n \\ & \\ M_Y(t) &= \left( e^t \, p + 1 - p \right)^m \end{aligned} \]
If \(Z = X + Y\) and \(X\) and \(Y\) are independent, then \(\mathbb{E}[ e^{t \, Z} ] = \mathbb{E}[ e^{t \, X} \, e^{t \, Y} ] = \mathbb{E}[ e^{t \, X} ] \, \mathbb{E}[ e^{t \, Y} ]\), which gives us \[ M_Z(t) = M_X(t) \, M_Y(t) = \left( e^t \, p + 1 - p \right)^{n + m} \] This is the MGF of a \(Bin(n+m, p)\) distribution, so by the uniqueness theorem above \[ X + Y \ \sim \ Bin(n + m, p ) \]
Suppose \(X_1, X_2, \ldots, X_n\) are independent random variables and \(S_n=X_{1}+X_{2}+\cdots +X_{n} = \sum_{j=1}^n X_j\) then, \[ M_{S_n}(t) = M_{X_1}(t) \, M_{X_2}(t) \, \cdots \, M_{X_n}(t) \, = \, \prod_{j=1}^n \, M_{X_j}(t) \] (you can prove this by induction, for example)
For example: if \(X_{1}, X_{2}, \ldots, X_{m}\) are independent, and \(X_{i}\sim Bin\left( n_{i}, p\right)\) and \(S_m = X_{1}+X_{2}+\cdots +X_{m}\), then \[ S_m \sim Bin\left( n_{1}+n_{2}+\cdots +n_{m},p\right) \] (prove it!)
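A simulation check (not a substitute for the proof), with arbitrary \(n_i\) and \(p\):

```python
# Sketch: simulate S_m = X_1 + ... + X_m with X_i ~ Bin(n_i, p) and compare with Bin(sum n_i, p).
import numpy as np

rng = np.random.default_rng(1)
ns, p, reps = [3, 5, 7], 0.4, 200_000

s = sum(rng.binomial(n, p, size=reps) for n in ns)   # samples of S_m
direct = rng.binomial(sum(ns), p, size=reps)         # samples from Bin(n_1 + n_2 + n_3, p)

for k in range(sum(ns) + 1):                         # empirical pmfs should be close
    print(k, (s == k).mean(), (direct == k).mean())
```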
Let \(X \sim {\cal P}(\lambda)\), with \(\lambda > 0\), then \[ M_X(t) = e^{ \lambda \left( e^t - 1 \right)} \] (prove it!)
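A quick numerical check (again, not a proof), truncating the defining series at a large index:

```python
# Sketch: compare the Poisson(lam) MGF formula with a truncated version of the defining sum.
from math import exp, factorial

lam, t = 2.5, 0.7   # arbitrary test values

direct = sum(exp(t * k) * exp(-lam) * lam**k / factorial(k) for k in range(100))
formula = exp(lam * (exp(t) - 1))

print(direct, formula)   # should agree to many decimal places
```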
If \(X_j \sim {\cal P}(\lambda_j)\) are independent, \(1 \le j \le n\), then \[ S_n = \sum_{j=1}^n X_j \, \sim \, {\cal P} \left( \sum_{j=1}^n \lambda_j \right) \] (prove it!)
Using the last result, you can show that if \(W \sim {\cal P}(\lambda)\), then \(W\) has the same distribution as the sum of \(n\) independent random variables, each with the same \({\cal P}(\lambda / n)\) distribution.
In symbols: for any \(n \in \mathbb{N}\), there exist \(T_1, \ldots, T_n\), independent and each with distribution \({\cal P}(\lambda / n)\), such that
\[ W \sim \sum_{\ell=1}^n T_\ell \] (the equation above means that the random variable on the left hand side has the same distribution as the one on the right hand side)
Suppose \(X_{1} \sim {\cal N}( \mu_1, \sigma_{1}^{2})\) and \(X_{2} \sim {\cal N}(\mu_2, \sigma_2^2)\), and that they are independent, then \[ S = X_1 + X_2 \sim {\cal N}( \mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2) \]
We prove it using MGFs: \[ \begin{aligned} M_S(t) &= M_{X_1+X_2}(t) = M_{X_1}(t) \, M_{X_2}(t) \\ & \\ &= e^{t^2 \, \sigma_1^2 /2 + t \, \mu_1} \, e^{t^2 \, \sigma_2^2 /2 + t \, \mu_2} \\ & \\ &= e^{t^2 \, (\sigma_1^2 + \sigma_2^2) /2 + t \, (\mu_1 + \mu_2)} \end{aligned} \] which is the MGF of a \({\cal N}(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)\) random variable
Suppose that \(X\) is a random variable with MGF \(M_{X}\left( t\right)\)
Let \(Y = a + b\, X\), where \(a, b \in \mathbb{R}\), then
\[ \begin{aligned} M_Y(t) & = \mathbb{E}[ e^{tY} ] = \mathbb{E}[ e^{t \, ( a + b \, X )} ] \\ & \\ &= \mathbb{E}[ e^{t \, a} \, e^{t \, b \, X} ] \\ & \\ &= e^{t \, a} \mathbb{E}[ e^{b \, t \, X} ] = e^{t \, a} \, M_{X} ( b \, t ) \end{aligned} \]
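A Monte Carlo sanity check of this identity, using \(X \sim {\cal N}(0, 1)\) as a test case (so that \(M_X(s) = e^{s^2/2}\)):

```python
# Sketch: check M_{a+bX}(t) = e^{ta} * M_X(bt) by simulation, with X ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(2)
a, b, t = 1.5, -0.8, 0.6                            # arbitrary test values

x = rng.normal(size=1_000_000)                      # X ~ N(0, 1)
lhs = np.exp(t * (a + b * x)).mean()                # Monte Carlo estimate of M_Y(t), Y = a + bX
rhs = np.exp(t * a) * np.exp((b * t) ** 2 / 2)      # e^{ta} * M_X(bt), with M_X(s) = e^{s^2/2}

print(lhs, rhs)                                     # should agree up to sampling error
```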
Suppose \(X_{1}, X_{2}, \ldots, X_{m}\) are independent random variables with \(X_{i} \sim {\cal N} ( \mu _{i}, \sigma _{i}^{2} )\), \(1 \le i \le m\), and \(a_i \in \mathbb{R}\), \(0 \le i \le m\), then \[ S=a_{0}+a_{1}X_{1}+a_{2}X_{2}+\cdots +a_{m}X_{m} \quad \sim {\cal N} \left( a_{0}+\sum_{i=1}^{m}a_{i}\mu _{i}, \ \sum_{j=1}^{m}a_{j}^{2}\sigma_{j}^{2} \right) \] (prove it!)
Suppose \(X_1, X_2, \ldots, X_n\) are independent random variables with the same distribution (and thus the same MGF \(M_X(t)\)), and let \[ \bar{X}_n = \frac{1}{n} \, \sum_{j=1}^n X_j \] Then \[ M_{\bar{X}_n}(t) = \left( M_X(t/n) \right)^n \] (combine the product formula for sums of independent random variables with the result above for \(Y = a + b \, X\), using \(a = 0\) and \(b = 1/n\)). This can be used to prove the Central Limit Theorem
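A simulation sketch of this identity for iid Exponential(1) variables, whose MGF is \(M_X(s) = 1/(1-s)\) for \(s < 1\):

```python
# Sketch: check M_{Xbar_n}(t) = (M_X(t/n))^n by simulation, with X_j iid Exponential(1).
import numpy as np

rng = np.random.default_rng(3)
n, t, reps = 5, 0.9, 500_000                        # arbitrary test values (need t/n < 1)

x = rng.exponential(scale=1.0, size=(reps, n))      # reps independent samples of (X_1, ..., X_n)
xbar = x.mean(axis=1)                               # the corresponding sample means
mc = np.exp(t * xbar).mean()                        # Monte Carlo estimate of E[e^{t * Xbar_n}]
formula = (1 / (1 - t / n)) ** n                    # (M_X(t/n))^n for Exponential(1)

print(mc, formula)                                  # should agree up to sampling error
```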
Let \(X \sim {\cal P}(\lambda)\) and \(Y | X = n \sim Bin(n, p)\)
We can use MGFs to find the marginal distribution of \(Y\):
\[ \begin{aligned} M_Y(t) &= \mathbb{E}[ e^{t \, Y} ] = \mathbb{E}[ \, \mathbb{E}[ e^{t \, Y} | X ] ] \\ & \\ &= \mathbb{E}\left[ \left(e^t \, p + 1 - p \right)^X \right] \\ & \\ &= e^{\lambda \, p \left( e^t - 1 \right)} \end{aligned} \] where the second equality uses the \(Bin(n, p)\) MGF for the inner conditional expectation. The result is the MGF of a \({\cal P}(\lambda \, p)\) distribution, so \(Y \sim {\cal P}(\lambda \, p)\)
The last step above uses the fact that, for \(X \sim {\cal P}(\lambda)\) and any \(s \in \mathbb{R}\), \[ \mathbb{E}[ s^X ] = \sum_{k=0}^\infty s^k \, e^{-\lambda} \, \frac{\lambda^k}{k!} = e^{-\lambda} \, e^{\lambda \, s} = e^{\lambda (s - 1)} \] so that, taking \(s = e^t \, p + 1 - p\), \[ \mathbb{E}\left[ \left(e^t \, p + 1 - p \right)^X \right] = e^{\lambda \left( e^t \, p + 1 - p - 1 \right)} = e^{\lambda \, p \left( e^t - 1 \right)} \]
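The conclusion \(Y \sim {\cal P}(\lambda \, p)\) is also easy to check by simulation; a minimal sketch:

```python
# Sketch: simulate X ~ Poisson(lam), Y | X ~ Bin(X, p), and compare Y with a Poisson(lam * p) sample.
import numpy as np

rng = np.random.default_rng(4)
lam, p, reps = 4.0, 0.3, 500_000

x = rng.poisson(lam, size=reps)        # X ~ Poisson(lam)
y = rng.binomial(x, p)                 # Y | X = n ~ Bin(n, p)  (numpy accepts an array of n's)
z = rng.poisson(lam * p, size=reps)    # direct Poisson(lam * p) sample, for comparison

for k in range(10):                    # empirical pmfs should be close
    print(k, (y == k).mean(), (z == k).mean())
```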