Module 08

Expected values, variance, correlation, and generating functions


TC and DJM

Last modified — 16 Mar 2026

1 Expected values

Expected value of random variables

Definition
The expected value of a random variable \(g(X)\) is defined by \[\mathbb{E}[g(X)] = \begin{cases} \displaystyle\sum_{x} g(x) p_X(x) & \text{if $X$ is discrete}\\ \\ \displaystyle\int_{-\infty}^\infty g(x) f_X(x) \mathsf{d}x & \text{if $X$ is absolutely continuous} \end{cases}\] provided that the sum or integral exists.

  • The sum in the discrete case is over all \(x\) such that \(p_X(x) > 0\) (countable).
  • The sum/integral exists when \(\mathbb{E}[|g(X)|] < \infty\). Otherwise, we say that \(\mathbb{E}[g(X)]\) does not exist.
  • Note that you do not need to know the distribution/PMF/PDF/CDF of \(g(X)\) to compute \(\mathbb{E}[g(X)]\), only the distribution of \(X\).
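To make the definition concrete, the discrete case can be sketched in a few lines of Python (the fair-die PMF below is just an illustrative choice, not from the text):

```python
# Illustrative sketch: E[g(X)] for a discrete X, computed as a weighted
# sum over the support. Here X is a fair six-sided die.
pmf = {x: 1/6 for x in range(1, 7)}  # p_X(x) for each x in the support

def expect(g, pmf):
    """Sum g(x) * p_X(x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

print(expect(lambda x: x, pmf))     # E[X] ≈ 3.5
print(expect(lambda x: x**2, pmf))  # E[X^2] ≈ 91/6
```

Note that only the PMF of \(X\) is needed, not the distribution of \(g(X)\), exactly as the bullet above says.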

Heuristic for expected value

Given a random variable \(X\),

\(\mathbb{E}[g(X)]\) is the weighted average value of \(g(X)\)

where the weights are given by the probabilities of \(X\) taking on different values.

In primary school, you learned about sample averages of \(n\) data points.

If we were to construct a discrete random variable \(Y\) by giving each observed value probability \(1/n\),

then \(\mathbb{E}[Y] = \frac{1}{n}\sum_{i=1}^n y_i\) would be the sample average of the observed values.
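A minimal sketch of this observation, using made-up data points:

```python
# The sample average equals E[Y] when Y puts mass 1/n on each observed
# value (the data below are arbitrary, for illustration only).
data = [2.0, 5.0, 5.0, 8.0]
n = len(data)
expected_Y = sum(y * (1/n) for y in data)  # E[Y] under the empirical PMF
sample_mean = sum(data) / n
print(expected_Y, sample_mean)  # both 5.0
```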

Simplified calculation for Binomial

Let \(X \sim {\mathrm{Binom}}(n, \theta)\). We want to compute \(\mathbb{E}[X]\).

  • Your book (Example 3.1.7) gives a complicated calculation. But we can use “kernel matching”.

We have that \[\begin{aligned} \mathbb{E}[X] &= \sum_{x=0}^n x \binom{n}{x} \theta^x (1-\theta)^{n-x} = \sum_{x=1}^n x \binom{n}{x} \theta^x (1-\theta)^{n-x} && \text{sum is 0 when } x=0\\ &= \sum_{x=1}^n n \binom{n-1}{x-1} \theta^x (1-\theta)^{n-x} && \text{because } x\binom{n}{x} = n\binom{n-1}{x-1}\\ &= n \sum_{y=0}^{n-1} \binom{n-1}{y} \theta^{y+1} (1-\theta)^{(n-1)-y} && \text{substitute } x = y + 1 \\ &= n\theta \sum_{y=0}^{n-1} \binom{n-1}{y} \theta^y (1-\theta)^{(n-1)-y} && \text{compare to } {\mathrm{Binom}}(n-1, \theta)\\ &= n\theta. \end{aligned}\]
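A quick numerical check of \(\mathbb{E}[X] = n\theta\), summing the Binomial PMF directly (the values of \(n\) and \(\theta\) below are arbitrary choices):

```python
from math import comb

# Numerical check of E[X] = n * theta for X ~ Binom(n, theta).
n, theta = 10, 0.3
mean = sum(x * comb(n, x) * theta**x * (1 - theta)**(n - x)
           for x in range(n + 1))
print(mean)  # ≈ 3.0 = n * theta
```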

Gamma expectation

Let \(X \sim \textrm{Gamma}(\alpha, \lambda )\). We want to compute \(\mathbb{E}[X]\).

We have that \[\begin{aligned} \mathbb{E}[X] &= \int_0^\infty x \frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x} \mathsf{d}x \\ &= \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha} e^{-\lambda x} \mathsf{d}x \\ &= \frac{\lambda^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha + 1)}{\lambda^{\alpha + 1}} \int_0^\infty \frac{\lambda^{\alpha+1}}{\Gamma(\alpha+1)} x^{\alpha + 1- 1} e^{-\lambda x} \mathsf{d}x \\ &= \frac{\Gamma(\alpha + 1)}{\lambda \Gamma(\alpha)} && \text{integrates to 1 because Gamma}(\alpha + 1, \lambda)\\ &= \frac{\alpha \Gamma(\alpha)}{\lambda \Gamma(\alpha)} && \text{because } \Gamma(\alpha + 1) = \alpha \Gamma(\alpha) \\ &= \frac{\alpha}{\lambda}. \end{aligned}\]
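A Monte Carlo sanity check of \(\mathbb{E}[X] = \alpha/\lambda\); the values of \(\alpha\) and \(\lambda\) are arbitrary, and note that Python's `random.gammavariate` is parameterized by shape and *scale*, so the scale argument is \(1/\lambda\):

```python
import random

# Monte Carlo check of E[X] = alpha / lambda for X ~ Gamma(alpha, lambda).
# random.gammavariate takes (shape, scale) with scale = 1/lambda.
random.seed(0)
alpha, lam = 3.0, 2.0
N = 200_000
samples = [random.gammavariate(alpha, 1/lam) for _ in range(N)]
print(sum(samples) / N)  # ≈ alpha/lam = 1.5, up to Monte Carlo error
```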

More Gamma expectations

Let \(X \sim {\mathrm{Gam}}(\alpha, \lambda)\) where \(\alpha>0\) and \(\lambda > 0\). Recall that the PDF of a RV \(Y\sim{\mathrm{Gam}}(\theta, \beta)\) is given by \[f_Y(y) = \frac{\beta^\theta}{\Gamma(\theta)} y^{\theta - 1} e^{-\beta y} I_{(0,\infty)}(y).\]

Exercise 1
Let \(t < \lambda\). Find \(\mathbb{E}[\exp(tX)]\).

Important properties

(Where the expected value exists.)

Linearity
For any \(a, b, c \in {\mathbb{R}}\), any functions \(g\) and \(h\), and any random variables \(X\) and \(Y\), \[\mathbb{E}[a g(X) + b h(Y) + c] = a \mathbb{E}[g(X)] + b \mathbb{E}[h(Y)] + c.\]
Boundedness
If \(a< g(x) < b\) for all \(x\) in the support of \(X\), then \(a < \mathbb{E}[g(X)] < b.\)
Monotonicity
If \(g(x) \le h(x)\) for all \(x\) in the support of \(X\), then \(\mathbb{E}[g(X)] \le \mathbb{E}[h(X)].\)
Independence
If \(X\) and \(Y\) are independent, then \[\mathbb{E}[g(X) h(Y)] = \mathbb{E}[g(X)] \mathbb{E}[h(Y)].\]

For the last property, the converse is false.

Expectation of a function of two random variables

Exercise 2
Let \(X\sim U(0,\theta)\) and \(Y\sim{\mathrm{Exp}}(1)\) be independent.

Find \(\mathbb{E}\left[\frac{1}{2}(X + Y)^2\right]\).

Scalar-valued functions of multiple random variables

Theorem
Let \(g : {\mathbb{R}}^2 \to {\mathbb{R}}\) be a function.

If \(X\) and \(Y\) are both discrete random variables, then \[\begin{aligned} \mathbb{E}[g(X, Y)] &= \sum_{x} \sum_{y} g(x, y) p_{X,Y}(x, y). \end{aligned}\] If \(X\) and \(Y\) are jointly absolutely continuous random variables, then \[\begin{aligned} \mathbb{E}[g(X, Y)] &= \int_{-\infty}^\infty \int_{-\infty}^\infty g(x, y) f_{X,Y}(x, y) \mathsf{d}x \mathsf{d}y. \end{aligned}\]
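The discrete double sum can be sketched directly; the joint PMF below is a made-up example, not one from the text:

```python
# E[g(X, Y)] as a double sum over a small, illustrative joint PMF.
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

def expect2(g, joint):
    """Sum g(x, y) * p_{X,Y}(x, y) over the joint support."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

print(expect2(lambda x, y: x + y, joint))  # E[X + Y] ≈ 1.2
print(expect2(lambda x, y: x * y, joint))  # E[XY] ≈ 0.4
```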

Product of expectations

  • If \(X\) and \(Y\) are independent, then \(\mathbb{E}[g(X)h(Y)] = \mathbb{E}[g(X)]\mathbb{E}[h(Y)]\).

Proof
Suppose that \(X\) and \(Y\) are jointly absolutely continuous random variables with joint PDF \(f_{X,Y}(x, y)\). Note that \(g(X)h(Y)\) is a scalar-valued function of \(X\) and \(Y\).

\[\begin{aligned} \mathbb{E}[g(X)h(Y)] &= \int_{-\infty}^\infty \int_{-\infty} ^\infty g(x) h(y) f_{X,Y}(x, y) \mathsf{d}x \mathsf{d}y \\ &= \int_{-\infty}^\infty \int_{-\infty}^\infty g(x) h(y) f_X(x) f_Y(y) \mathsf{d}x\mathsf{d}y && \text{$X$ and $Y$ are independent} \\ &= \int_{-\infty}^\infty g(x) f_X(x) \mathsf{d}x \int_{-\infty}^\infty h(y) f_Y(y) \mathsf{d}y \\ &= \mathbb{E}[g(X)] \mathbb{E}[h(Y)]. \end{aligned}\]
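A Monte Carlo sketch of the product property, with illustrative choices \(X\sim U(0,1)\), \(Y\sim{\mathrm{Exp}}(1)\), \(g(x)=x^2\), and \(h(y)=y\) (none of these are from the text):

```python
import random

# Monte Carlo check that E[g(X) h(Y)] = E[g(X)] E[h(Y)] when X and Y
# are independent. Here E[X^2] = 1/3 and E[Y] = 1, so both sides ≈ 1/3.
random.seed(1)
N = 200_000
xs = [random.random() for _ in range(N)]        # X ~ U(0, 1)
ys = [random.expovariate(1.0) for _ in range(N)]  # Y ~ Exp(1), independent
g = lambda x: x**2
h = lambda y: y
lhs = sum(g(x) * h(y) for x, y in zip(xs, ys)) / N
rhs = (sum(map(g, xs)) / N) * (sum(map(h, ys)) / N)
print(lhs, rhs)  # both ≈ 1/3, up to Monte Carlo error
```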

Computing expectations using the joint distribution

Let \(X\) and \(Y\) have joint PDF \[f_{X,Y}(x,y) = 8xyI_{\{0 < x < y < 1\}}(x,y).\]

Exercise 3
Find \(\mathbb{E}[X]\), \(\mathbb{E}[Y]\), and \(\mathbb{E}[XY]\).

2 Variance

Variance

Definition
The variance of a random variable \(X\) is defined by \[\sigma^2_X = \operatorname{Var}(X) = \mathbb{E}[(X - \mathbb{E}[X])^2].\]

  • Careful: \(\mathbb{E}[X]\) is a number, not a random variable.
  • The variance is a measure of the spread of the distribution of \(X\) around its mean \(\mathbb{E}[X]\).
  • Note that \(g(X) = (X - \mathbb{E}[X])^2\) is a function of \(X\), so we can compute \(\operatorname{Var}(X)\) using the definition of expected value.
  • The “units” of \(\operatorname{Var}(X)\) are the square of the units of \(X\).

Definition
The standard deviation of a random variable \(X\) is defined by \(\sigma_X = \sqrt{\operatorname{Var}(X)}.\)

Properties of variance

(Where the variance exists.)

Scaling
For any \(a \in {\mathbb{R}}\), \(\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)\).
Shift invariance
For any \(a \in {\mathbb{R}}\), \(\operatorname{Var}(X + a) = \operatorname{Var}(X)\).
Non-negativity
\(\operatorname{Var}(X) \ge 0\).

Tip

\[\begin{aligned} \operatorname{Var}(X) &= \mathbb{E}[(X - \mathbb{E}[X])^2] \\ &= \mathbb{E}[X^2 - 2X\mathbb{E}[X] + \mathbb{E}[X]^2] \\ &= \mathbb{E}[X^2] - 2\mathbb{E}[X]\mathbb{E}[X] + \mathbb{E}[X]^2 \\ &= \mathbb{E}[X^2] - \mathbb{E}[X]^2.\\ \Longrightarrow \operatorname{Var}(X) &\leq \mathbb{E}[X^2]. \end{aligned}\]
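Both formulas for the variance can be checked against each other on a small example (a fair six-sided die, an arbitrary illustrative choice):

```python
# Check that the definition Var(X) = E[(X - E[X])^2] agrees with the
# shortcut E[X^2] - E[X]^2, using a fair six-sided die.
pmf = {x: 1/6 for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())                    # E[X] = 3.5
var_def = sum((x - mu)**2 * p for x, p in pmf.items())     # definition
var_short = sum(x**2 * p for x, p in pmf.items()) - mu**2  # shortcut
print(var_def, var_short)  # both ≈ 35/12
```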

Exponential variance

Exercise 4
Let \(X \sim {\mathrm{Exp}}(\lambda)\). Find \(\operatorname{Var}(X)\).

Hints: remember that \(\mathbb{E}[X] = 1/\lambda\) and that \(\Gamma(z) = (z-1)!\) for positive integers \(z\).

3 Covariance and correlation

Covariance

If we have two random variables \(X\) and \(Y\), we can measure the relationship between them.

Definition
The covariance between two random variables \(X\) and \(Y\) is defined by \[\operatorname{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])].\]

  • \(\operatorname{Cov}(X, Y)\) is a scalar-valued function of \(X\) and \(Y\).
  • The covariance measures the linear relationship between \(X\) and \(Y\).
  • If \(\operatorname{Cov}(X, Y) > 0\), then \(X\) and \(Y\) tend to increase together.

We can compute the covariance using the joint PMF/PDF of \(X\) and \(Y\): \[\begin{aligned} \operatorname{Cov}(X, Y) &= \sum_{x} \sum_{y} (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) p_{X,Y}(x, y).\\ \operatorname{Cov}(X, Y) &= \int_{-\infty}^\infty \int_{-\infty}^\infty (x - \mathbb{E}[X])(y - \mathbb{E}[Y]) f_{X,Y}(x, y) \, \mathsf{d}x \, \mathsf{d}y. \end{aligned}\]
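The discrete double sum for the covariance, sketched on a made-up joint PMF (the same illustrative table style as earlier examples):

```python
# Cov(X, Y) from a small, illustrative joint PMF via the definition.
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}
EX = sum(x * p for (x, y), p in joint.items())  # E[X] = 0.5
EY = sum(y * p for (x, y), p in joint.items())  # E[Y] = 0.7
cov = sum((x - EX) * (y - EY) * p for (x, y), p in joint.items())
print(cov)  # ≈ E[XY] - E[X]E[Y] = 0.4 - 0.5 * 0.7 = 0.05
```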

Properties of covariance

Linearity
For any \(a, b, c \in {\mathbb{R}}\), and random variables \(X\), \(Y\), and \(Z\), \[\begin{aligned} \operatorname{Cov}(a X + b Y, c Z ) &= ac \operatorname{Cov}(X, Z) + bc \operatorname{Cov}(Y, Z). \end{aligned}\]
Easier calculation
\[\operatorname{Cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y].\]
Independence
If \(X\) and \(Y\) are independent, then \(\operatorname{Cov}(X, Y) = 0\). The converse is false.
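A standard counterexample shows why the converse fails: take \(X\) uniform on \(\{-1, 0, 1\}\) and \(Y = X^2\). Then \(Y\) is a function of \(X\), so they are clearly dependent, yet the covariance is zero:

```python
# Counterexample to the converse: X uniform on {-1, 0, 1} and Y = X^2
# are dependent (Y is determined by X), yet Cov(X, Y) = 0.
pmf = {-1: 1/3, 0: 1/3, 1: 1/3}
EX = sum(x * p for x, p in pmf.items())          # E[X] = 0
EY = sum(x**2 * p for x, p in pmf.items())       # E[Y] = E[X^2]
EXY = sum(x * x**2 * p for x, p in pmf.items())  # E[XY] = E[X^3] = 0
cov = EXY - EX * EY
print(cov)  # 0.0
```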

Heads or tails?

Exercise 5
Let \(X\) be the number of heads in 5 tosses of a fair coin, and let \(Y\) be the number of tails in the same 5 tosses. Find \(\operatorname{Cov}(X, Y)\).

Hint: If \(Z\sim {\mathrm{Binom}}(n, \theta)\), then \(\operatorname{Var}(Z) = n\theta(1-\theta)\).

Variance, covariance, and sums

Let \(X\) and \(Y\) be random variables with finite variances.

  • \[\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y).\]
  • If \(X\) and \(Y\) are independent, then \[\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y).\]
  • More generally, if \(X_1, \ldots, X_n\) are independent random variables with finite variances, then \[\operatorname{Var}\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n \operatorname{Var}(X_i).\]
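The first identity also holds exactly for sample moments, which gives an easy numerical check; the correlated pair below is an arbitrary construction for illustration:

```python
import random

# Check Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) on sample moments,
# using a deliberately correlated pair: X ~ U(0,1), Y = X + noise.
random.seed(2)
N = 200_000
xs = [random.random() for _ in range(N)]
ys = [x + random.gauss(0, 0.5) for x in xs]

def var(zs):
    m = sum(zs) / len(zs)
    return sum((z - m)**2 for z in zs) / len(zs)

def cov(us, vs):
    mu, mv = sum(us) / len(us), sum(vs) / len(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / len(us)

sums = [x + y for x, y in zip(xs, ys)]
lhs = var(sums)
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(lhs, rhs)  # equal up to floating-point rounding
```

The identity holds exactly (not just approximately) for these sample quantities because the same algebra applies to the empirical distribution.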

Correlation

Covariance is not a standardized measure of the relationship between \(X\) and \(Y\).

For example, if we multiply \(X\) by 100, then \(\operatorname{Cov}(X, Y)\) will also be multiplied by 100.

Definition
The correlation between two random variables \(X\) and \(Y\) is defined by \[\rho_{XY} = \operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}.\]

  • The correlation is a standardized measure of the linear relationship between \(X\) and \(Y\).
  • We’ll see later that \(-1 \le \rho_{XY} \le 1\).

Returning to the heads or tails example, we have that \[\begin{aligned} \rho_{XY} &= \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-5/4}{\sqrt{5/4} \sqrt{5/4}} = -1. \end{aligned}\]
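This value can be verified exactly from the Binomial PMF, since \(Y = 5 - X\) with \(X \sim {\mathrm{Binom}}(5, 1/2)\):

```python
from math import comb, sqrt

# Exact check of the heads-or-tails example: X ~ Binom(5, 1/2) heads,
# Y = 5 - X tails, so Corr(X, Y) should be -1.
pmf = {x: comb(5, x) * 0.5**5 for x in range(6)}
EX = sum(x * p for x, p in pmf.items())
EY = sum((5 - x) * p for x, p in pmf.items())
cov = sum((x - EX) * ((5 - x) - EY) * p for x, p in pmf.items())
sx = sqrt(sum((x - EX)**2 * p for x, p in pmf.items()))
sy = sqrt(sum(((5 - x) - EY)**2 * p for x, p in pmf.items()))
rho = cov / (sx * sy)
print(rho)  # ≈ -1
```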

End of material for Midterm 2.