3.7 Normal Distribution: $\operatorname{\mathsf{N}}(\mu,\sigma^{2})$

A random variable $X$ has a normal distribution if its pdf is given by

\displaystyle f_{X}(x;\boldsymbol{\theta})=\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left\{-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right\}

for $-\infty<x<\infty$, with parameters $\boldsymbol{\theta}=(\mu,\sigma^{2})$. We write $X\sim\operatorname{\mathsf{N}}(\mu,\sigma^{2})$. This pdf is illustrated in Figure 3.5. Notice that all the curves are symmetric about $\mu$ with a characteristic bell shape, the width of which is controlled by $\sigma^{2}$.

Figure 3.5: The pdf for the normal distribution with two different sets of parameter values.

The normal distribution has played a central role in the history of probability and statistics. It was introduced by the French mathematician Abraham de Moivre in 1733, who used it to approximate probabilities of winning in various games of chance involving coin tossing. It was later used by the German mathematician Carl Friedrich Gauss to predict the location of astronomical bodies, and became known as the Gaussian distribution.

In statistics it is by far the most important distribution. Traditionally, it has been viewed as the natural distribution of measurement errors, yields from field experiments, etc. The theoretical justification for this is the central limit theorem (CLT, see Chapter 9), which says that the sum of a large number of independent random variables, each of which is small compared to the sum, will be approximately normally distributed. The central limit theorem is also the reason why the normal distribution often occurs as the approximate distribution of estimators in statistics.

A random variable $Z$ that has a normal distribution with $\mu=0$ and $\sigma=1$ is said to have the standard normal distribution and, by definition, has density

\displaystyle\phi(z)=\frac{1}{\sqrt{2\pi}}\exp\left(-z^{2}/2\right)

for $-\infty<z<\infty$. This pdf is shown in the top left panel of Figure 3.5. The corresponding cdf is

\displaystyle\Phi(z)=\operatorname{\mathsf{P}}\left({Z\leq z}\right)=\int_{-\infty}^{z}\phi(t)\,\mathrm{d}t=\int_{-\infty}^{z}\frac{1}{\sqrt{2\pi}}\exp\left(-t^{2}/2\right)\,\mathrm{d}t.

Values of $\Phi(z)$ can be obtained from a table of standard normal probabilities, although we will use the R function pnorm().
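The notes evaluate $\Phi(z)$ with R's pnorm(). As an illustrative sketch only, the same values can be obtained in Python (a choice made for this example, not the course's tooling). The helper name `phi_cdf` is hypothetical, and the identity $\Phi(z)=\frac{1}{2}\{1+\operatorname{erf}(z/\sqrt{2})\}$ it relies on is a standard fact that is not stated in these notes.

```python
import math
from statistics import NormalDist


def phi_cdf(z: float) -> float:
    """Standard normal cdf via the error function (hypothetical helper).

    Uses Phi(z) = (1 + erf(z / sqrt(2))) / 2.
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# NormalDist() with no arguments is the standard normal, N(0, 1).
std_normal = NormalDist()

# Compare the erf-based formula with the library cdf at a few points.
for z in (-1.96, 0.0, 1.0, 1.96):
    print(f"z = {z:5.2f}  erf form: {phi_cdf(z):.6f}  NormalDist: {std_normal.cdf(z):.6f}")
```

Both columns should agree to the printed precision; `std_normal.cdf(z)` plays the role that pnorm(z) plays in R.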

The following will be proved in Chapter 4, but will be useful right now in simplifying a number of our calculations.

Theorem 3.7.1.

If $X\sim\operatorname{\mathsf{N}}(\mu,\sigma^{2})$, then the random variable

\displaystyle Z=\frac{X-\mu}{\sigma}\sim\operatorname{\mathsf{N}}(0,1)

and conversely, if $Z\sim\operatorname{\mathsf{N}}(0,1)$, then the random variable

\displaystyle X=\mu+\sigma Z\sim\operatorname{\mathsf{N}}(\mu,\sigma^{2}).
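Theorem 3.7.1 implies that $\operatorname{\mathsf{P}}(X\leq x)=\Phi((x-\mu)/\sigma)$. The following is a small numerical check of that consequence, not a proof; the parameter values and the function name `cdf_via_standardisation` are hypothetical, chosen only for illustration, and Python is used here in place of R.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 2.0, 3.0

# NormalDist takes the standard deviation sigma, not the variance sigma^2.
X = NormalDist(mu, sigma)
Z = NormalDist(0.0, 1.0)


def cdf_via_standardisation(x: float) -> float:
    """P(X <= x) computed as Phi((x - mu) / sigma)."""
    return Z.cdf((x - mu) / sigma)


for x in (-1.0, 2.0, 6.5):
    direct = X.cdf(x)
    standardised = cdf_via_standardisation(x)
    print(f"x = {x:5.2f}  direct: {direct:.6f}  standardised: {standardised:.6f}")
```

The two columns should coincide up to floating-point rounding, which is what the theorem predicts.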

We will now show that the normal density is indeed a density. We then find the moments of the standard normal distribution and use Theorem 3.7.1 to obtain the corresponding moments of the general normal distribution.

The density is clearly non-negative, so it remains to show that it integrates to $1$. First, we substitute $s=(v-\mu)/\sigma$, so $\mathrm{d}s=\mathrm{d}v/\sigma$, to obtain:

	$\displaystyle\int_{-\infty}^{\infty}f_{X}(v)\,\mathrm{d}v$	$\displaystyle=\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma^{2}}}\exp\left\{-\frac{(v-\mu)^{2}}{2\sigma^{2}}\right\}\,\mathrm{d}v$
		$\displaystyle=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(-s^{2}/2)\,\mathrm{d}s.$

So it is sufficient to show that $\phi(s)$ integrates to $1$. We do this and find the moments of the standard normal distribution at the same time. For $r\geq 0$, let

\displaystyle J(r)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}s^{r}\exp(-s^{2}/2)\,\mathrm{d}s.

For odd $r$, the integrand is an odd function of $s$ and so $J(r)=0$; i.e. $\operatorname{\mathsf{E}}\left[{Z^{r}}\right]=0$.

For even $r$, the integrand is an even function of $s$ and so

\displaystyle J(r)=\frac{2}{\sqrt{2\pi}}\int_{0}^{\infty}s^{r}\exp(-s^{2}/2)\,\mathrm{d}s=\frac{\sqrt{2}}{\sqrt{\pi}}\int_{0}^{\infty}s^{r-1}\exp(-s^{2}/2)s\,\mathrm{d}s.

Substituting $t=s^{2}/2$, so $\mathrm{d}t=s\,\mathrm{d}s$ and $s=(2t)^{1/2}$, gives

	$\displaystyle J(r)$	$\displaystyle=\frac{\sqrt{2}}{\sqrt{\pi}}\int_{0}^{\infty}(2t)^{(r-1)/2}\exp(-t)\,\mathrm{d}t$
		$\displaystyle=\frac{2^{r/2}}{\sqrt{\pi}}\int_{0}^{\infty}t^{(r+1)/2-1}\exp(-t)\,\mathrm{d}t$
		$\displaystyle={\color[rgb]{0.76,0.01,0}\frac{2^{r/2}}{\sqrt{\pi}}\Gamma\left(\frac{r+1}{2}\right).}$

With $r=0$ this is $\frac{1}{\sqrt{\pi}}\Gamma(1/2)=1$, since $\Gamma(1/2)=\sqrt{\pi}$, so the density does integrate to $1$.

With $r=2$ it is

\displaystyle\frac{2}{\sqrt{\pi}}\Gamma(3/2)=\frac{2}{\sqrt{\pi}}\frac{1}{2}\Gamma(1/2)=1,

so $\operatorname{\mathsf{E}}\left[{Z^{2}}\right]=1$, and since $\operatorname{\mathsf{E}}\left[{Z}\right]=0$, we have ${\operatorname{\mathsf{Var}}}\left[{Z}\right]=1$.

With $r=4$ it is

\displaystyle\frac{2^{2}}{\sqrt{\pi}}\Gamma(5/2)=\frac{2^{2}}{\sqrt{\pi}}\frac{3}{2}\frac{1}{2}\Gamma(1/2)=3,

so $\operatorname{\mathsf{E}}\left[{Z^{4}}\right]=3$, as mentioned in Chapter 2.
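The moment formula $J(r)=\frac{2^{r/2}}{\sqrt{\pi}}\Gamma\left(\frac{r+1}{2}\right)$ for even $r$ (and $J(r)=0$ for odd $r$) can be checked numerically. The sketch below is an illustration in Python rather than part of the derivation: the function names, the truncation of the integral to $[-12,12]$ and the midpoint rule are all assumptions made for this example.

```python
import math


def moment_closed_form(r: int) -> float:
    """E[Z^r] for Z ~ N(0, 1): zero for odd r, the Gamma formula for even r."""
    if r % 2 == 1:
        return 0.0
    return 2.0 ** (r / 2) / math.sqrt(math.pi) * math.gamma((r + 1) / 2)


def moment_numeric(r: int, half_width: float = 12.0, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of J(r), truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        s = -half_width + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += s ** r * math.exp(-s * s / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


for r in range(5):
    print(f"r = {r}  closed form: {moment_closed_form(r):.6f}  numeric: {moment_numeric(r):.6f}")
```

For $r=0,2,4$ the closed form should reproduce the values $1$, $1$ and $3$ found above, with the numerical integral agreeing closely.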

Since $X=\mu+\sigma Z$,

$\operatorname{\mathsf{E}}\left[{X}\right]=\operatorname{\mathsf{E}}\left[{\mu+\sigma Z}\right]=\mu+\sigma\operatorname{\mathsf{E}}\left[{Z}\right]=\mu$,
${\operatorname{\mathsf{Var}}}\left[{X}\right]=\sigma^{2}{\operatorname{\mathsf{Var}}}\left[{Z}\right]=\sigma^{2}$,

as we would expect.

Mathematically, probability statements for a general normal random variable can always be simplified to probability statements for a standard normal random variable, and can often then be written in terms of $\Phi$. Thus, our R examples will always use the standard normal; if you use the general normal in R, note that the functions use the standard deviation $\sigma$ as an argument, not the variance $\sigma^{2}$.

Example 3.7.1.

A normal model is proposed for the variation in height $H$ (measured in cm) of women, with parameters $\mu=170$ and $\sigma^{2}=36$. Write in terms of $\Phi$, and evaluate numerically, the probability that a randomly selected woman is over 180 cm tall.

Solution. $H\sim\operatorname{\mathsf{N}}(170,36)$, so

	$\displaystyle\operatorname{\mathsf{P}}\left({H>180}\right)$	$\displaystyle={\color[rgb]{0.76,0.01,0}\operatorname{\mathsf{P}}\left({(H-170)/6>(180-170)/6}\right)}$
		$\displaystyle={\color[rgb]{0.76,0.01,0}\operatorname{\mathsf{P}}\left({Z>10/6}\right)=1-\operatorname{\mathsf{P}}\left({Z<5/3}\right)}$
		$\displaystyle={\color[rgb]{0.76,0.01,0}1-\Phi(5/3)}$
		$\displaystyle={\color[rgb]{0.76,0.01,0}0.0478,}$

since 1-pnorm(5/3) gives $0.04779035$.
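As a cross-check of Example 3.7.1, here is a minimal Python sketch (Python is an assumption of this example; the notes use R). It computes the probability both through the standard normal and directly from the general normal, and illustrates that, as with the R functions, the second argument of `NormalDist` is the standard deviation $\sigma=6$, not the variance $36$.

```python
from statistics import NormalDist

mu, variance = 170.0, 36.0
sigma = variance ** 0.5            # standard deviation, 6 cm

# Route 1: standardise, then use the standard normal cdf Phi.
z = (180.0 - mu) / sigma           # 10/6 = 5/3
p_standard = 1.0 - NormalDist().cdf(z)

# Route 2: use the general normal directly (argument is sigma, not sigma^2).
p_general = 1.0 - NormalDist(mu, sigma).cdf(180.0)

print(f"{p_standard:.4f}")         # 0.0478
print(f"{p_general:.4f}")          # 0.0478
```

Passing the variance 36 as the second argument by mistake would describe a much wider distribution and give a noticeably larger tail probability.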

Example 3.7.2.

Find the lower quartile of $X\sim\operatorname{\mathsf{N}}(1,9)$.

Solution.

	$\displaystyle 0.25$	$\displaystyle=\operatorname{\mathsf{P}}\left({X\leq x_{0.25}}\right)$
		$\displaystyle={\color[rgb]{0.76,0.01,0}\operatorname{\mathsf{P}}\left({\frac{X-1}{3}\leq\frac{x_{0.25}-1}{3}}\right)}$
		$\displaystyle={\color[rgb]{0.76,0.01,0}\operatorname{\mathsf{P}}\left({Z\leq\frac{x_{0.25}-1}{3}}\right)=\Phi\left(\frac{x_{0.25}-1}{3}\right).}$

So $x_{0.25}=1+3\Phi^{-1}(0.25)=-1.023$ (evaluated as 1+3*qnorm(0.25)).
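The quantile calculation in Example 3.7.2 can be sketched in Python as follows (again an illustrative substitute for the R call, with hypothetical variable names). `NormalDist().inv_cdf` plays the role of $\Phi^{-1}$, i.e. of qnorm() in R.

```python
from statistics import NormalDist

mu, sigma = 1.0, 3.0               # X ~ N(1, 9), so sigma = sqrt(9) = 3

# Lower quartile of the standard normal, Phi^{-1}(0.25).
z_quartile = NormalDist().inv_cdf(0.25)

# Transform back to the scale of X using X = mu + sigma * Z.
x_quartile = mu + sigma * z_quartile

# The same quantile taken directly from the general normal, for comparison.
x_direct = NormalDist(mu, sigma).inv_cdf(0.25)

print(f"{x_quartile:.3f}")         # -1.023
print(f"{x_direct:.3f}")           # -1.023
```

Feeding `x_quartile` back into `NormalDist(mu, sigma).cdf` should return $0.25$ up to rounding, confirming that it is the lower quartile.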

Example 3.7.3.

Let $X\sim\operatorname{\mathsf{N}}(5,4)$. Find $\operatorname{\mathsf{P}}\left({X^{2}<9}\right)$ in terms of $\Phi$, the cdf of the standard normal distribution.