Continuous random variables
Last modified — 24 Oct 2025
A random variable \(X\) is continuous if its CDF \(F_X(x) = \mathbb{P}(X \leq x)\) is continuous in \(x\).
Stat 302 focuses on continuous random variables whose CDF can be written as \[F_X(x) = \mathbb{P}(X \leq x) = \int_{-\infty}^x f_X(t) \mathsf{d}t\] for some real function \(f_X\). Naturally, such \(f_X\) satisfies \(f_X(t) \ge 0\) for all \(t\) and \(\int_{-\infty}^{\infty} f_X(t) \, \mathsf{d}t = 1\).
We say that \(f_X\) is the probability density function (PDF) of \(X\).
Recall that CDFs are always right-continuous \[F_X(a) \, = \, \lim_{u \searrow a} F_X(u) \qquad \forall \, a \in {\mathbb{R}}\] where “\(u \searrow a\)” means taking the limit with \(u > a\)
For continuous random variables, the CDF is continuous, so the left- and right-limits coincide: \[F_X(a) \, = \, \lim_{u \searrow a} F_X(u) \, = \, \lim_{u \nearrow a} F_X(u) \, = \, \lim_{u \to a} F_X(u) \, , \quad \forall \, a \in \mathbb{R}\]
Let \(X\) be continuous with CDF \(F_X(x)\) and PDF \(f_X(x)\). Then, for any real function \(g\),
\[ \mathbb{E}[g ( X )]= \int_{-\infty }^{\infty } g(t) f_X(t) \, \mathsf{d}t \]
if the integral exists.
In particular, taking \(g(t) = t\) gives \[\mathbb{E}[ X ] \, = \, \int_{-\infty }^{\infty } \, t \, f_X(t)\, \mathsf{d}t.\]
Often \(f_X(t) = 0\) outside some interval \([a, b]\). If so, the integrals above can naturally be restricted to the interval \([a, b]\)
The familiar identity \(\operatorname{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2\) also holds here. The proof is similar to the one for the discrete case, but now we use integrals instead of sums. To simplify the notation, write \(\mu_X = \mathbb{E}[X]\):
\[\begin{aligned} \operatorname{Var}(X) &=\int_{-\infty}^\infty (a-\mu_X)^{2} \, f_X(a) \, \mathsf{d}a = \int_{-\infty}^\infty (a^{2} + \mu_X^{2} - 2 \, a \, \mu_X) \, f_X(a) \, \mathsf{d}a \\ & \\ &= \int_{-\infty }^{\infty} a^{2} \, f_X(a) \, \mathsf{d}a + \mu_X^{2} \overset{1}{\overbrace{\int_{-\infty }^{\infty}f_X(a) \, \mathsf{d}a}} - 2 \, \mu_X \, \overset{\mu_X}{\overbrace{\int_{-\infty }^{\infty} a \, f_X(a) \, \mathsf{d}a}} \\ & \\ &= \mathbb{E}[X^{2}] + \mu_X^{2} - 2 \, \mu_X^{2} = \mathbb{E}[X^{2}] - (\mathbb{E}[X])^{2} \end{aligned} \]
The density function \(f_X(x)\) may be larger than 1! In particular, \(f_X(a) \ne \mathbb{P}( X = a )\); indeed, \(\mathbb{P}(X = a) = 0\) for every \(a\) when \(X\) is continuous.
Because of the Fundamental Theorem of Calculus:
\[ F_X'(a) = \left. \frac{\mathsf{d}}{\mathsf{d}x} F_X \left( x \right) \right|_{x = a}\, = \, \left. \frac{\mathsf{d}}{\mathsf{d}x} \left[ \int_{-\infty}^x f_X(t) \, \mathsf{d}t \right] \right|_{x=a} \, = \, f_X(a) \]
we have
\[ f_X(x) = \lim_{ \delta \to 0+} \frac{\mathbb{P}(X \leq x+\delta) - \mathbb{P}(X\leq x)}{\delta} \]
Intuitively, regions with (relatively) higher values of \(f_X\) have higher probability
Because, for very small values of \(\delta > 0\)
\[ \mathbb{P}\left( x - \delta/2 < X < x + \delta/2 \right) \ = \ \int_{x-\delta /2}^{x+\delta/2} f_X\left( t\right) \mathsf{d}t \quad \approx \quad f_X(x) \delta \]
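A quick Monte Carlo sanity check of this approximation in R. The density used here is a hypothetical choice for illustration: \(f(x) = 2x\) on \((0,1)\), which is the density of \(X = \sqrt{U}\) with \(U \sim {\mathrm{Unif}}(0,1)\).

```r
# Check that P(x - d/2 < X < x + d/2) is approximately f(x) * d.
# Illustrative density: f(x) = 2x on (0, 1), i.e. X = sqrt(U), U ~ Unif(0, 1).
set.seed(302)                              # arbitrary seed, for reproducibility
x_sim <- sqrt(runif(1e6))
x0 <- 0.4; d <- 0.01
mean(x_sim > x0 - d/2 & x_sim < x0 + d/2)  # ~ 0.008
2 * x0 * d                                 # f(x0) * d = 0.008
```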
\[\begin{aligned} && \text{Discrete} && \text{Continuous} \\ F(x) && \sum_{k\leq x} f\left( k\right) && \int_{-\infty}^{x}f\left( t\right) \mathsf{d}t \\ & & \\ \mathbb{E}[ g(X) ] && \sum_{i \in {\cal R}_X} g\left( i\right) f\left( i\right) && \int_{-\infty }^{\infty } g\left( u\right) f\left( u\right) \mathsf{d}u\\ & & \\ \mathbb{P}( X\in B ) && \sum_{h\in B}f ( h ) && \int_{B} f(a) \mathsf{d}a \end{aligned} \]
For \(a < b\), real numbers, the \({\mathrm{Unif}}(a, b)\) distribution has density function \[f_X(x) \, = \, \begin{cases} 0 & x < a \\ 1 / \left( b - a \right) & a \le x \le b \\ 0 & x > b \end{cases}\]
More concisely:
\[ f_X \left( x \right) \, = \, \frac{1}{ b - a } \, \mathbf{1}_{[a, b]}(x) \] where \(\mathbf{1}_H(x)\) is the indicator function of the set \(H\):
\[ \mathbf{1}_H(x) = 1 \ \ \text{ if } \ x \in H\, , \quad \text{ and } \quad \mathbf{1}_H(x) = 0 \ \ \text{ if } \ \ x \notin H \]
\[ F_X \left( x \right) \, = \, \begin{cases} 0 & x \le a \\ \frac{x - a} {b - a} & a < x < b \\ 1 & x \ge b \end{cases} \]
For \(a \le s < t \le b\): \[ \mathbb{P}\left( s < X \le t \right) = \mathbb{P}\left( s < X < t \right) = \mathbb{P}\left( s \le X \le t \right) = F_X(t) - F_X(s) = \frac{t - s}{b - a} \]
(hence the name uniform)
However, for regions that go outside the range \({\cal R}_X = [a, b]\), things are a bit more delicate:
If \(t > b\) (but \(a \le s \le b\)):
\[ \begin{aligned} \mathbb{P}\left( s < X \le t \right) = \mathbb{P}\left( s < X < t \right) = \mathbb{P}( s \le X \le t ) \, & = \, F(t) - F(s) \\ & = 1 - F(s)\\ &= \frac{b - s}{b - a}. \end{aligned} \]
Similarly, if \(s < a\) (but \(a \le t \le b\)): \[ \mathbb{P}\left( s < X \le t \right) = \mathbb{P}\left( s < X < t \right) = \mathbb{P}\left( s \le X \le t \right) = \frac{t - a}{b - a} \]
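These three cases can be checked with R's `punif`, the \({\mathrm{Unif}}(a, b)\) CDF; the values \(a = 0\), \(b = 2\) below are just an illustration.

```r
a <- 0; b <- 2
punif(1.5, a, b) - punif(0.5, a, b)   # a <= s < t <= b: (t - s)/(b - a) = 0.5
punif(3.0, a, b) - punif(0.5, a, b)   # t > b:           (b - s)/(b - a) = 0.75
punif(1.0, a, b) - punif(-1.0, a, b)  # s < a:           (t - a)/(b - a) = 0.5
```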
For different values of \(a < b\) we get different uniform distributions
It is useful to study families of distributions, since many of their properties hold across all members of the family.
For example, if the random variable \(X\) has a Binomial distribution, \(\mathrm{Bin}(n, p)\), regardless of the actual values of \(n \in \mathbb{N}\) and \(p \in [0, 1]\), we always have \(\mathbb{E}[X] = n \, p\) and \(\operatorname{Var}(X) = n \, p \, (1-p)\)
If \(X \sim {\mathrm{Unif}}(a, b)\)
\[ \begin{aligned} \mathbb{E}\left[ X\right] & = \int_{-\infty}^{+\infty} t \, f_X(t) \, \mathsf{d}t = \int_a^b t \, f_X(t) \, \mathsf{d}t = \int_a^b t \, \left( \frac{1}{b - a} \right) \, \mathsf{d}t \\ &= \frac{1}{b-a} \, \int_a^b t\, \mathsf{d}t = \left( \frac{1}{b - a} \right) \, \left[ \left. \frac{t^{2}}{2}\right\vert _{a}^{b} \right] \\ &=\frac{1}{b-a} \, \left( \frac{b^2-a^2}{2} \right) =\frac{a + b}{2} \end{aligned}\]
Similarly, \[ \mathbb{E}\left[ X^2 \right] \, = \, \int_a^b \frac{t^2}{b - a} \, \mathsf{d}t \, = \, \frac{b^{3}-a^{3}}{3\left( b - a \right) } \, = \, \frac{b^{2}+a^{2}+ a \, b }{3} \]
And thus, for any \(a < b\), we have that if \(X \sim {\mathrm{Unif}}(a, b)\)
\[ \operatorname{Var}(X) = \mathbb{E}\left[ X^2 \right] - \left( \mathbb{E}\left[ X \right] \right)^2 = \frac{ \left(b - a \right)^2 }{ 12 }\]
These expectation and variance formulas hold for every member of the continuous uniform family.
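A quick simulation check of both formulas, with the illustrative values \(a = 2\), \(b = 7\):

```r
set.seed(302)
a <- 2; b <- 7
x <- runif(1e6, a, b)
c(mean(x), (a + b) / 2)    # both ~ 4.5
c(var(x), (b - a)^2 / 12)  # both ~ 2.083
```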
Suppose that \(X\sim{\mathrm{Unif}}( 0, 10)\)
Calculate \(\mathbb{P}( X > 3 )\) and \(\mathbb{P}( X > 5 \mid X > 2 )\).
\[ \mathbb{P}\left( X > 3\right) = 1 - F_X\left( 3\right) = 1-\frac{3}{10}=0.70 \]
\[\begin{aligned} \mathbb{P}\left( X > 5 \mid X > 2 \right) &= \frac{ \mathbb{P}\left( \{ X > 5 \} \cap \{ X > 2 \} \right) }{ \mathbb{P}\left( X > 2 \right) } \\ &= \frac{ \mathbb{P}\left( X > 5 \right) }{ \mathbb{P}\left( X > 2 \right) } = \frac{0.5}{0.8} = 0.625 \end{aligned}\]
In the second line above we used that since \(\{ X > 5 \} \subseteq \{ X > 2 \}\), we have \[ \{ X > 5 \} \cap \{ X > 2 \} = \{ X > 5 \} \]
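Both answers can be verified with `punif`:

```r
1 - punif(3, 0, 10)                            # P(X > 3) = 0.7
(1 - punif(5, 0, 10)) / (1 - punif(2, 0, 10))  # P(X > 5 | X > 2) = 0.625
```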
Consider a continuous random variable \(X\) and define a new random variable \[Y \, = \, g \left( X \right)\] where \(g : \mathcal{R}_X \to {\mathbb{R}}\).
\[ \mathbb{E}\left[ Y \right] = \mathbb{E}[ g(X) ] = \int_{-\infty }^{\infty }g\left( x\right) f_X \left( x\right) \mathsf{d}x \]
(this is neither obvious, nor easy to prove)
For example, let \(X \sim {\mathrm{Unif}}(0, 1)\) and \(Y = g(X) = 1/(X+1)\). Then \[ \mathbb{E}\left[ Y\right] = \mathbb{E}\left[ \frac{1}{X+1}\right] =\int_{-\infty }^{\infty } g(a) f_X(a) \mathsf{d}a = \int_{0}^{1} \left( \frac{1}{a+1} \right) \mathsf{d}a =\log \left( a+1\right) \bigg|_{0}^{1} = \, \log(2) \]
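A Monte Carlo sanity check of this expectation:

```r
set.seed(302)
mean(1 / (runif(1e6) + 1))  # ~ 0.6931
log(2)                      # 0.6931472
```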
Consider a stick of length one (unit)
The stick has a mark at location \(x_0 \in [0, 1]\)
We break the stick in two pieces at a random location \(X \sim {\mathrm{Unif}}(0,1)\) along it
Let \(Y\) be the length of the piece that contains the mark
The stick is broken at the point \(X\), and either \(X < x_0\) or \(X \ge x_0\): if \(X < x_0\), the mark lies on the right piece \([X, 1]\), which has length \(1 - X\); if \(X \ge x_0\), it lies on the left piece \([0, X]\), which has length \(X\).
Therefore
\[ Y = g \left( X \right) \, = \, \begin{cases} 1 - X & X < x_0; \\ X & X \ge x_0. \end{cases} \]
\[ \begin{aligned} \mathbb{E}\left[ Y\right] &= \int_{-\infty }^{\infty }g\left( x\right) f\left( x\right) \mathsf{d}x=\int_{0}^{1}g\left( x\right) \mathsf{d}x =\int_{0}^{x_0}\left( 1-x\right) \mathsf{d}x + \int_{x_0}^{1}x\mathsf{d}x \\ &=\frac{1}{2}+x_0-x_0^{2} =\frac{1}{2}+x_0\left( 1-x_0\right). \end{aligned} \]
Define \(h(x_0) = \mathbb{E}[{Y}] = 1/2 + x_0 (1-x_0 )\) and notice that this function is at least twice continuously differentiable.
Its extrema satisfy the first-order condition:
\[ \frac{\mathsf{d}}{\mathsf{d}x_0} h(x_0) = 1 - 2x_0 \overset{set}{=} 0 \quad \Longleftrightarrow \quad x_0 = 1/2 \]
Since \(h''(x_0) = -2 < 0\), this critical point is a maximum: the expected length of the piece containing the mark is largest when the mark is at the midpoint \(x_0 = 1/2\).
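A small simulation of the broken-stick example, with the illustrative mark location \(x_0 = 0.3\):

```r
set.seed(302)
x0 <- 0.3
x  <- runif(1e6)                 # breaking point
y  <- ifelse(x < x0, 1 - x, x)   # length of the piece containing the mark
c(mean(y), 1/2 + x0 * (1 - x0))  # both ~ 0.71
```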
If \(X\) is a random variable and \(Y = g(X)\) for some function \(g(\cdot)\), we want to find the relationship between the CDFs of the two random variables: \(F_Y\) and \(F_X\)
Standard procedure:
Find the range of \(Y\): \(\mathcal{R}_Y\).
Calculate the CDF \(F_Y(y)\) for \(y \in \mathcal{R}_Y\):
\[ F_Y(y) = \mathbb{P}\left( Y \le y \right) = \mathbb{P}\left( g \left( X \right) \le y \right) \]
Then find its PDF by differentiating the CDF:
\[ f_Y(y) = F'_Y(y) \]
Let \(X \sim {\mathrm{Unif}}\left( -1,1\right)\)
Find the CDF and PDF of \(Y = X^3\) and \(Z = X^2\).
Recall that \(f_X(t) = 1/2\) for \(t \in (-1, 1)\) and \(f_X(t) = 0\) otherwise.
The range of \(Y = X^3\) is \(\mathcal{R}_Y = (-1, 1)\).
Hence: \(F_Y(y) = 0\) if \(y < -1\), and \(F_Y(y) = 1\) if \(y > 1\).
For \(y \in (-1, 1)\) we have \[\begin{aligned} F_Y(y) & = \mathbb{P}( Y \le y ) = \mathbb{P}( X^3 \le y ) \\ &= \mathbb{P}( X \le y^{1/3} ) = \int_{-1}^{y^{1/3}} (1/2) \, \mathsf{d}t \\ & = ( y^{1/3} + 1 ) / 2. \end{aligned}\]
Sanity check: \(F_Y\) should be continuous: \(F_Y(-1) = 0\), \(F_Y(1) = 1\).
The PDF of \(Y\), \(f_Y(y) = F_Y'(y) = \frac{1}{2} \times \frac{1}{3} \times y^{1/3 - 1} = \frac{1}{6} \, y^{-2/3}\) for \(y \in (-1, 1)\), thus:
\[ f_Y(y) = F_Y'(y) = \begin{cases} 0 & y \notin (-1, 1) \\ \displaystyle \frac{y^{-2/3}} {6} & y \in (-1, 1), \end{cases}\]
Note that \(f_Y(0)\) is not defined, but still \(\int_{-1}^1 f_Y(y) \mathsf{d}y = \left. (1/2) y^{1/3} \right|_{-1}^1 = 1\)
For \(Z = X^2\), the range is \(\mathcal{R}_Z = [0, 1]\).
Thus: \(F_Z(z) = 0\) for \(z < 0\) and \(F_Z(z)=1\) for \(z > 1\).
For \(z \in (0, 1)\)
\[\begin{aligned} F_Z(z) &= \mathbb{P}(Z \leq z) = \mathbb{P}(X^2 \leq z) = \mathbb{P}(-\sqrt{z} \leq X \leq \sqrt{z}) \\ &= \int_{-\sqrt{z}}^{\sqrt{z}} (1/2) \mathsf{d}t = (1/2) \left( \Bigl. t \ \Bigr|_{-\sqrt{z}}^{\sqrt{z}} \right) = (1/2) \left( \sqrt{z} - (-\sqrt{z}) \right) = \sqrt{z} \end{aligned}\]
Sanity check: we can see that \(F_Z\) is continuous (\(F_Z(0) = 0\), \(F_Z(1)=1\)), and non-decreasing.
For the PDF: \(f_Z(a) = 0\) if \(a \notin (0, 1)\) and \[ f_Z(a) = F_Z'(a) = \frac{1}{2} a^{-1/2} = \frac{1}{2 \, \sqrt{a}} \quad \text{ for } a \in (0, 1) \]
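Both CDFs can be checked empirically by simulating \(X \sim {\mathrm{Unif}}(-1, 1)\); the evaluation point \(0.5\) below is arbitrary:

```r
set.seed(302)
x <- runif(1e6, -1, 1)
c(mean(x^3 <= 0.5), (0.5^(1/3) + 1) / 2)  # F_Y(0.5): both ~ 0.897
c(mean(x^2 <= 0.5), sqrt(0.5))            # F_Z(0.5): both ~ 0.707
```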
Let \(X \sim {\mathrm{Unif}}(0,1)\) and \(Y = - \log(X)\), where \(\log( \cdot )\) is the natural logarithm function.
Find the CDF and PDF of \(Y\): \(F_Y\) and \(f_Y\).
Examples: the time until the next customer arrives, the next earthquake occurs, or a system fails. Such waiting times are often modeled by a random variable \(X\) with CDF
\[ F_X(x) = \begin{cases} 1 - e^{-\lambda \, x} & \text{ if } x > 0 \\ & \\ 0 & \text{otherwise} \end{cases} \]
If the CDF of a random variable \(W\) is \[ F_W(t) = 1 - e^{-3.5 \, t} \qquad \text{ for } t \ge 0 \] and \(F(t) = 0\) for \(t < 0\), then \(W \sim {\cal E}(3.5)\)
Last example revisited: if \(X \sim {\mathrm{Unif}}(0,1)\) and \(Y = -\log(X)\), then what is the distribution of \(Y\)? \(Y \ \sim \ ???\)
This leads to the family of exponential distributions \({\cal E}(\lambda)\), where \(\lambda > 0\) is a fixed arbitrary positive real number.
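One way to guess the answer to the exercise above before doing any algebra is to simulate; the comparison below uses R's `pexp`, which (note) is parametrized by the rate:

```r
set.seed(302)
y <- -log(runif(1e5))
c(mean(y <= 1), pexp(1, rate = 1))  # both ~ 0.632
c(mean(y <= 2), pexp(2, rate = 1))  # both ~ 0.865
```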
We have already encountered several families of distributions: for example \(\mathrm{Bin}(n, p)\), \({\cal P}(\lambda)\), \({\mathrm{Unif}}(a, b)\), and now \({\cal E}(\lambda)\).
The set of possible values of the unknown parameters (for example: \(\lambda\), or \((n, p)\), or \((a, b)\) with \(a < b\), etc.) is called the parameter space
The PDF is \[f_X(x) = F_X'(x) = \begin{cases}\lambda \, e^{-\lambda \, x} & \text{ if } x > 0\\ 0 & \text{otherwise}\end{cases}\]
There is a connection between the Exponential distribution and the Poisson distribution. We will see that if \(X \sim {\cal P}(\lambda)\), then the (random) wait time \(T\) until the next occurrence of the event that \(X\) counts satisfies \(T \sim {\cal E}(\lambda)\).
Thus, \(\lambda\) is typically said to represent the rate of occurrence of the event being modeled. For example:
\[\begin{aligned}
\lambda & = 5 \text{ customers per hour} \\
\lambda & = 0.5 \text{ earthquakes per year}
\end{aligned}\]
\[\mathbb{E}[ X ] \, = \, \int_{-\infty}^{\infty} t \, f_X(t) \, \mathsf{d}t \, = \, \int_0^{\infty} t \, \lambda \, e^{-\lambda \, t} \, \mathsf{d}t \, = \, 1/ \lambda \]
Warning
Different textbooks and software use different parametrizations of the Exponential distribution.
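For instance, R's `dexp`, `pexp`, and `rexp` use the rate \(\lambda\) (not the mean \(1/\lambda\)); a quick check with the illustrative value \(\lambda = 2\):

```r
set.seed(302)
lambda <- 2
c(dexp(1, rate = lambda), lambda * exp(-lambda))  # equal: rate parametrization
c(mean(rexp(1e6, rate = lambda)), 1 / lambda)     # E[X] = 1/lambda = 0.5
```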
We will find the CDF of \(T\), \(F_T(t)\) for \(t \in \mathbb{R}\). Let \(X_t\) denote the number of occurrences in the time interval \([0, t]\); then \[ X_t \sim {\cal P}( \lambda \, t) \]
Also \(\{ T > t \} = \{ X_t = 0 \}\) (why?) thus \[ \mathbb{P}( T>t) = \mathbb{P}( X_t=0) = \frac{e^{-\lambda t}( \lambda t)^{0}}{0!} = e^{-\lambda t} \]
Hence: \[ F_T(t) = \mathbb{P}( T \leq t ) = 1 - \mathbb{P}( T > t) = 1 - e^{-\lambda t} \qquad \text{ for } t > 0 \]
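A numerical check of the identity \(\mathbb{P}(T > t) = \mathbb{P}(X_t = 0)\), with the illustrative values \(\lambda = 2\), \(t = 1.5\):

```r
lambda <- 2; t <- 1.5
dpois(0, lambda * t)        # P(X_t = 0) = e^(-3) ~ 0.0498
1 - pexp(t, rate = lambda)  # P(T > t), same value
```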
Suppose that cases of an infectious disease in BC follow a Poisson process with rate \(\lambda =3\) per 30 days. In other words, the number of cases in any 1-month period is a random variable \(C\) with \(C \sim {\cal P}(3)\).
Let \(X\) be the number of cases per day; then \(X \sim {\cal P}( 3/30 ) = {\cal P}( 1/10 )\)
Let \(T\) be the waiting time (in days) until the first case in October. Then \(T \sim {\cal E}(1/10)\)
For part (a): the question asks for \(\mathbb{P}( T \ge 15)\), so: \[\begin{aligned} \mathbb{P}( T \ge 15 ) = 1 - \mathbb{P}( T \le 15 ) & = 1 - \int_{0}^{15} 0.1 \, e^{-0.1 \, t} \, \mathsf{d}t \\ & \\ &=e^{-1.5} \ \approx \ 0.22313 \end{aligned}\]
For part (b), the conditioning event is that \(T > 10\).
We need to calculate \(\mathbb{P}( T < 20 \mid T > 10)\): \[\begin{aligned} \mathbb{P}( T < 20 \ \vert\ T > 10 ) &=\frac{ \mathbb{P}( \{ T<20 \} \cap \{ T>10 \} ) }{\mathbb{P}( T>10 ) } = \frac{\mathbb{P}( 10<T<20\ ) }{\mathbb{P}( T>10\ ) } \\ & \\ &= \frac{F_{T}( 20 ) - F_T ( 10 ) }{1 - F_T ( 10 ) } \\ & \\ &= \frac{e^{-0.1\times 10}-e^{-0.1\times 20}}{e^{-0.1\times 10}} = 1 - e^{-0.1\times 10} \\ & \\ &= \mathbb{P}(T < 10) \end{aligned}\]
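Both parts can be checked with `pexp`:

```r
# Part (a): P(T >= 15) with rate 0.1 per day
1 - pexp(15, rate = 0.1)                               # e^(-1.5) ~ 0.2231
# Part (b): P(T < 20 | T > 10), which equals P(T < 10)
(pexp(20, 0.1) - pexp(10, 0.1)) / (1 - pexp(10, 0.1))  # 1 - e^(-1) ~ 0.6321
pexp(10, 0.1)                                          # same value
```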
Let \(W \sim {\cal E}(\lambda)\), for some \(\lambda > 0\).
To fix ideas, assume that \(W\) is the lifetime of a system (the time until the system fails)
Compare, for \(t, h > 0\): \[ \mathbb{P}( W > h ) \qquad \text{and} \qquad \mathbb{P}(W > t + h \ \vert\ W > t ) \]
\[ \mathbb{P}( W > h ) = 1 - \mathbb{P}( W \le h) = e^{-\lambda \, h} \]
\[ \begin{aligned} \mathbb{P}(W > t+h \ \vert\ W> t ) &= \frac{\mathbb{P}\left( \left\{ W>t+h \right\} \cap \left\{ W > t \right\} \right) } {\mathbb{P}\left( W>t \right) } \\ &=\frac{\mathbb{P}\left( W>t+h \right) }{\mathbb{P}\left( W>t \right) } \\ &= \frac{e^{-\lambda \,( t+h) } }{e^{-\lambda \, t}} = e^{-\lambda \, h} = \mathbb{P}\left( W > h \right) \end{aligned} \] (regardless of \(t\))
If this model is appropriate
At age \(t\), the probability of surviving until age \(t+h\) is the same for any \(t\)
E.g. the probability of reaching 35 at age 30 is the same as that of reaching 95 at age 90
In other words: the system does not “get old”
This may not always be realistic (people?)
Model choices may have unintended consequences
The CDF of a standard normal random variable \(Z\) is \[ F_Z(z) = \Phi (z) \, = \, \int_{-\infty }^{z}\phi(t) \, \mathsf{d}t \, = \, \int_{-\infty }^{z} \frac{1}{\sqrt{2\pi }} \exp(-t^{2}/2) \, \mathsf{d}t \] where \(\phi(t) = \frac{1}{\sqrt{2\pi}} \exp(-t^2/2)\) denotes its density.
The value of \(\Phi (z)\) has to be computed numerically
In the old days, people used tables:
| \(z\) | \(\Phi (z)\) | \(z\) | \(\Phi (z)\) |
|---|---|---|---|
| 1.645 | 0.950 | -1.645 | 0.050 |
| 1.960 | 0.975 | -1.960 | 0.025 |
| 2.326 | 0.990 | -2.326 | 0.010 |
| 2.576 | 0.995 | -2.576 | 0.005 |
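Nowadays these values come straight from R's `pnorm` (the function \(\Phi\)) and `qnorm` (its inverse):

```r
pnorm(c(1.645, 1.960, 2.326, 2.576))  # 0.950 0.975 0.990 0.995
qnorm(c(0.950, 0.975, 0.990, 0.995))  # 1.645 1.960 2.326 2.576
```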
The standard normal density is symmetric about zero: \[ \phi(z) = \phi(-z) \quad \text{for any } z \in \mathbb{R} \] and hence \[ \Phi(a) = \mathbb{P}( Z \le a) = \mathbb{P}( Z \ge -a) = 1 - \Phi(-a) \qquad \forall \, a \in {\mathbb{R}} \]
The mean of a standard normal random variable is \(\mathbb{E}[ Z ]=0\).
Proof: if \(\phi(z)\) is the density of \(Z\), then \[\phi'(z) \, = \, -z \, \phi(z) \qquad \mbox{ (prove it!)}\] and thus
\[ \begin{aligned} \mathbb{E}[ Z ] & = \int_{-\infty}^\infty z \, \phi (z) \, \mathsf{d}z \, = \, - \int_{-\infty}^\infty \phi'(z) \, \mathsf{d}z = \lim_{u \to \infty} \bigg(\phi(-u) - \phi(u)\bigg) = 0. \end{aligned} \]
\[\begin{aligned} \operatorname{Var}(Z)&= \mathbb{E}[ Z^{2}] =\int_{-\infty }^{\infty }z^{2}\phi (z)\mathsf{d}z =2\int_{0}^{\infty }z^{2}\phi (z)\mathsf{d}z =-2\int_{0}^{\infty }z\phi ^{\prime }(z)\mathsf{d}z. \end{aligned}\]
Integration by parts with \[\begin{aligned} u &=z & \mathsf{d}v&=\phi ^{\prime }(z)\mathsf{d}z \\ \mathsf{d}u &=\mathsf{d}z & v &= \phi (z) \end{aligned}\] gives \[ \operatorname{Var}\left( Z\right) = -2\biggl[ \quad \overset{0}{\overbrace{z\phi (z)\bigg|_{0}^{\infty }}} -\overset{1/2}{\overbrace{\int_{0}^{\infty }\phi (z)\,\mathsf{d}z}}\quad \biggr] =1.\]
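Both moments can be confirmed by numerical integration in R:

```r
integrate(function(z) z   * dnorm(z), -Inf, Inf)  # E[Z]   = 0
integrate(function(z) z^2 * dnorm(z), -Inf, Inf)  # E[Z^2] = Var(Z) = 1
```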
\(Z \sim \mathcal{N}(0, 1)\) means
\(Z\) is a Gaussian (Normal) random variable with mean 0 and variance 1.
Equivalently:
\(Z\) is a standard normal random variable
Let \(Z\sim \mathcal{N}\left( 0,1\right)\) and, for \(\mu \in {\mathbb{R}}\), \(\sigma > 0\), define \[X=\mu +\sigma Z\]
We say \(X\) has a Gaussian distribution, and we have \[\begin{aligned} \mathbb{E}\left[ X\right] &=\mathbb{E}\left[ \mu +\sigma Z\right] = \mu +\sigma \overset{0}{\overbrace{\mathbb{E}\left[ Z\right] }}=\mu;\\ \operatorname{Var}(X) &= \operatorname{Var}\left( \mu +\sigma Z\right) =\sigma ^{2}\overset{1}{\overbrace{\operatorname{Var}\left( Z\right) }}=\sigma ^{2}.\end{aligned}\]
Notation: \(X \sim \mathcal{N}\left( \mu ,\sigma^2 \right)\).
The cdf of \(X\sim\mathcal{N}(\mu,\sigma^2)\) is \[\begin{aligned} F_X\left( x\right) &= \mathbb{P}\left( X\leq x\right) = \mathbb{P}\left( \frac{X-\mu }{\sigma } \leq \frac{x-\mu }{\sigma }\right) = \mathbb{P}\left( Z\leq \frac{x-\mu }{\sigma }\right) =\Phi \left( \frac{x-\mu }{\sigma }\right). \end{aligned}\]
Its pdf is given by \[\begin{aligned} f_X\left( x\right) &= F_X^{\prime }\left( x\right) =\frac{1}{\sigma }\phi \left( \frac{x-\mu }{\sigma }\right) = \frac{1}{\sigma \sqrt{2\pi }} \exp \Big \{ -\frac{1}{2}\left( \frac{x-\mu }{\sigma} \right)^{2} \Big \}. \end{aligned}\]
Warning
Often, \(Z\) is used specifically for standard normal RVs
The following key result can be used to compute probabilities for a Gaussian distribution with any mean and any variance: \[X \sim \mathcal{N}\left( \mu ,\sigma ^{2}\right) \ \Longleftrightarrow \ Z = \frac{X - \mu}{\sigma} \sim \mathcal{N}\left( 0, 1 \right)\] hence \[\mathbb{P}\left( X \le t \right) \, = \, F_X(t) \, = \, \Phi \left( \frac{t-\mu }{\sigma }\right)\] where \(\Phi\) is the CDF of a standard normal distribution. In other words, one only needs to be able to compute one CDF (namely: \(\Phi\))
Very good numerical approximations exist to compute the function \(\Phi\)
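In R, `pnorm` computes \(\Phi\); it also accepts `mean` and `sd` arguments, consistent with the standardization above (the values below are illustrative):

```r
mu <- 10; sigma <- 2; t <- 13
pnorm(t, mean = mu, sd = sigma)  # P(X <= t) ~ 0.9332
pnorm((t - mu) / sigma)          # Phi((t - mu)/sigma), same value
```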
Let \(Z\sim \mathcal{N}\left( 0,1\right)\). Calculate
\(\mathbb{P}\left( 0.10\leq Z\leq 0.35\right)\).
\(\mathbb{P}\left( Z>1.25\right)\).
\(\mathbb{P}\left( Z>-1.20\right)\).
Find \(c\) such that \(\mathbb{P}\left( Z>c\right) =0.05\).
Find \(c\) such that \(\mathbb{P}\left( \left\vert Z\right\vert <c\right) =0.95\).
We will use R to demonstrate.
We have \[\begin{aligned} \mathbb{P}\left( 0.10\leq Z\leq 0.35\right) &= \Phi \left( 0.35\right) -\Phi \left( 0.10\right) \approx 0.097. \end{aligned}\]
\[\begin{aligned} \mathbb{P}\left( Z>1.25\right) &= 1 - \mathbb{P}\left( Z\leq 1.25\right) = 1-\Phi \left( 1.25\right) \approx 0.106. \end{aligned}\]
\[\begin{aligned} \mathbb{P}\left( Z>-1.2\right) &= 1 - \mathbb{P}\left( Z\leq -1.2\right) =1-\Phi \left( -1.2\right) \approx 1- 0.115 = 0.885. \end{aligned}\]
\[\begin{aligned} 1-\Phi \left( c\right) &= 0.05 \Longleftrightarrow \Phi \left( c\right) = 0.95 \Longleftrightarrow c = \Phi ^{-1}\left( 0.95\right) \approx 1.645. \end{aligned}\]
\[\begin{aligned} \mathbb{P}\left( |Z| < c\right) &= \mathbb{P}\left( -c < Z < c\right) =\Phi\left( c\right) -\Phi \left( -c\right) = \Phi \left( c\right) -\left[ 1-\Phi \left( c\right) \right] \\ &= 2\Phi \left( c\right) -1 =0.95 \\ \Phi \left( c\right) &=\frac{1.95}{2}=0.975 \Longleftrightarrow c =\Phi ^{-1}\left( 0.975\right) \approx 1.96. \end{aligned}\]
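The five answers in R:

```r
pnorm(0.35) - pnorm(0.10)  # (a) ~ 0.097
1 - pnorm(1.25)            # (b) ~ 0.106
1 - pnorm(-1.20)           # (c) ~ 0.885
qnorm(0.95)                # (d) c ~ 1.645
qnorm(0.975)               # (e) c ~ 1.960
```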
A machine at a supermarket chain fills generic-brand bags of pancake mix.
The mean can be set by the machine operator.
The machine isn’t perfectly accurate. The actual amount of pancake mix it dispenses is random, but it follows a normal distribution.
Suppose that we are interested in 5kg bags of pancake mix.
What is the mean at which the machine should be set if at most 10% of the bags can be underweight? Assume that \(\sigma = 0.1\) kg.
What if \(\sigma = 0.1 \mu\)?
Let \(X\) represent the actual weight of the pancake mix in a 5kg bag.
By assumption, \(X\sim\mathcal{N}\left( \mu ,\ 0.01\right)\)
We should choose \(\mu\) so that \(\mathbb{P}\left( X < 5 \right) = 0.1\)
\[\begin{aligned} \mathbb{P}\left( X<5\right) = 0.1 &\Longleftrightarrow \quad F_X\left( 5\right) =0.1 \quad \Longleftrightarrow \quad \Phi \left( \frac{5-\mu }{0.1}\right) =0.1 \\ & \Longleftrightarrow \quad \frac{5-\mu }{0.1} = \Phi^{-1}\left( 0.1\right) \approx -1.282 & \mbox{(using \texttt{qnorm(.1)} in \texttt{R})}\\ & \Longleftrightarrow \quad \mu = 5 + 0.1\times 1.282=5.128. \end{aligned}\]
The operator should set the target mean to 5.128 kg.
Now we have \(X\sim \mathcal{N}( \mu , 0.01\mu ^{2})\).
Again, we should pick \(\mu\) so that \(P( X<5 ) = 0.1.\)
\[\begin{aligned} \mathbb{P}\left( X<5\right) = 0.1 & \Longleftrightarrow \quad F_X\left( 5\right) =0.1 \quad \Longleftrightarrow \quad \Phi \left( \frac{5-\mu }{0.1\mu }\right) = 0.1. \\ & \Longleftrightarrow \quad \frac{5-\mu }{0.1\mu } = \Phi^{-1}\left( 0.1\right) \approx -1.282 \\ & \Longleftrightarrow \quad \mu \left( 1 - 1.282 \times 0.1\right) = 5. \\ & \Longleftrightarrow \quad \mu = \frac{5}{1-0.128}\approx 5.73.\end{aligned}\]
The operator should set the target mean to 5.73 kg.
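Checking both settings with `pnorm` and `qnorm` (recall that R's `sd` argument is \(\sigma\), not \(\sigma^2\)):

```r
# Part (a): mu = 5.128, sigma = 0.1
pnorm(5, mean = 5.128, sd = 0.1)    # ~ 0.10
# Part (b): sigma = 0.1 * mu
mu <- 5 / (1 - 0.1 * qnorm(0.9))    # 5.735
pnorm(5, mean = mu, sd = 0.1 * mu)  # ~ 0.10
```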
Stat 302 - Winter 2025/26