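A random variable $X$ has a normal distribution if its pdf is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}$$
for $x \in \mathbb{R}$, with parameters $\mu \in \mathbb{R}$ and $\sigma > 0$. We write $X \sim N(\mu, \sigma^2)$. This pdf is illustrated in Figure 3.5. Notice that all the curves are symmetric around $\mu$ with a characteristic bell shape, the width of which is controlled by $\sigma$.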
The normal distribution has played a central role in the history of probability and statistics. It was introduced by the French mathematician Abraham de Moivre in 1733, who used it to approximate probabilities of winning in various games of chance involving coin tossing. It was later used by the German mathematician Carl Friedrich Gauss to predict the location of astronomical bodies and became known as the Gaussian distribution.
In statistics it is by far the most important distribution. Traditionally, it has been viewed as the natural distribution of measurement errors, yields from field experiments etc. The theoretical justification for this is the central limit theorem (CLT, see Chapter 9), which says that the sum of a large number of independent random variables each of which is small compared to the sum will be approximately normally distributed. The central limit theorem is also the reason why the normal distribution often occurs as the approximate distribution of estimators in statistics.
A random variable $Z$ that has a normal distribution with $\mu = 0$ and $\sigma^2 = 1$ is said to have the standard normal distribution and, by definition, has density
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$
for $z \in \mathbb{R}$. This pdf is shown in the top left panel of Figure 3.5. The corresponding cdf is
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt.$$
Values of $\Phi(z)$ can be obtained from a table of standard normal probabilities, although we will use the R function pnorm().
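As a quick numerical illustration, the standard normal cdf can be evaluated both by integrating the density directly and with pnorm(). This is only a sketch: the helper name `phi` and the evaluation point `z = 1.5` are illustrative choices, not taken from the text.

```r
# Standard normal density written out from its formula.
# (`phi` is an illustrative name; base R provides the same function as dnorm().)
phi <- function(z) exp(-z^2 / 2) / sqrt(2 * pi)

z <- 1.5  # arbitrary evaluation point, chosen for illustration

# cdf obtained by numerically integrating the density from -Inf to z
cdf_by_integration <- integrate(phi, lower = -Inf, upper = z)$value

# cdf obtained from the built-in function
cdf_by_pnorm <- pnorm(z)

print(c(integration = cdf_by_integration, pnorm = cdf_by_pnorm))
```

The two values should agree to within the tolerance of the numerical integration, which is why pnorm() can stand in for a table lookup.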
The following will be proved in Chapter 4, but will be useful right now in simplifying a number of our calculations.
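If $X \sim N(\mu, \sigma^2)$, then the random variable
$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1),$$
and conversely, if $Z \sim N(0, 1)$, then the random variable
$$X = \mu + \sigma Z \sim N(\mu, \sigma^2).$$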
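The standardisation $Z = (X - \mu)/\sigma$ can be checked numerically by comparing the cdf of a general normal with the standard normal cdf evaluated at the standardised point. The parameter values below are hypothetical, chosen only for illustration.

```r
# Hypothetical parameter values, chosen only for illustration
mu <- 2
sigma <- 3                     # standard deviation, so the variance is sigma^2 = 9
x <- c(-4, 0, 2, 5, 10)        # points at which to compare the two cdfs

# P(X <= x) computed directly for X ~ N(mu, sigma^2)
direct <- pnorm(x, mean = mu, sd = sigma)

# The same probabilities via Z = (X - mu) / sigma ~ N(0, 1)
standardised <- pnorm((x - mu) / sigma)

print(all.equal(direct, standardised))
```

This is a consistency check, not a proof: it only confirms that the two routes give the same probabilities at the chosen points.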
We will now show that the normal density is indeed a density. We then find the moments of the standard normal distribution and use Theorem 3.7.1 to obtain the corresponding moments of the general normal distribution.
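The density formula clearly cannot be negative. Firstly we substitute $z = (x-\mu)/\sigma$, so $dx = \sigma\, dz$, to obtain:
$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\} dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz.$$
So it is sufficient to show that the standard normal density integrates to $1$. We do this and find the moments of the standard normal distribution at the same time. For $k = 0, 1, 2, \ldots$, let
$$I_k = \int_{-\infty}^{\infty} z^k\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz.$$
For odd $k$, the integrand is an odd function of $z$ and so $I_k = 0$; i.e. $E[Z^k] = 0$.
For even $k$, the integrand is an even function of $z$ and so
$$I_k = \frac{2}{\sqrt{2\pi}} \int_{0}^{\infty} z^k e^{-z^2/2}\, dz.$$
Substituting $u = z^2/2$, so $z = (2u)^{1/2}$ and $dz = (2u)^{-1/2}\, du$, gives
$$I_k = \frac{2}{\sqrt{2\pi}} \int_{0}^{\infty} (2u)^{(k-1)/2} e^{-u}\, du = \frac{2^{k/2}}{\sqrt{\pi}}\, \Gamma\!\left(\frac{k+1}{2}\right).$$
With $k = 0$ this is $\Gamma(1/2)/\sqrt{\pi} = 1$, so the density does integrate to $1$.
With $k = 2$ it is
$$I_2 = \frac{2}{\sqrt{\pi}}\, \Gamma\!\left(\frac{3}{2}\right) = \frac{2}{\sqrt{\pi}} \cdot \frac{\sqrt{\pi}}{2} = 1,$$
so $E[Z^2] = 1$, and since $E[Z] = 0$, we have $\mathrm{Var}(Z) = 1$.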
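Since $X = \mu + \sigma Z$,
$$E[X] = \mu + \sigma E[Z] = \mu,$$
$$\mathrm{Var}(X) = \sigma^2\, \mathrm{Var}(Z) = \sigma^2,$$
as we would expect.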
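The moment calculation can be sanity-checked numerically. The sketch below compares numerical integration of $z^k$ times the standard normal density with the closed form $2^{k/2}\,\Gamma((k+1)/2)/\sqrt{\pi}$ for even $k$ (and $0$ for odd $k$). The function names `moment_numeric` and `moment_formula` are illustrative, not from the text.

```r
# k-th moment of Z ~ N(0, 1) by numerical integration of z^k * dnorm(z)
moment_numeric <- function(k) {
  integrate(function(z) z^k * dnorm(z), lower = -Inf, upper = Inf)$value
}

# Closed form: 0 for odd k, 2^(k/2) * Gamma((k + 1) / 2) / sqrt(pi) for even k
moment_formula <- function(k) {
  if (k %% 2 == 1) return(0)
  2^(k / 2) * gamma((k + 1) / 2) / sqrt(pi)
}

k <- 0:4
numeric_vals <- sapply(k, moment_numeric)
formula_vals <- sapply(k, moment_formula)

# Side-by-side comparison; k = 0 checks that the density integrates to 1
print(round(cbind(k, numeric = numeric_vals, formula = formula_vals), 6))
```

The rows for $k = 0$ and $k = 2$ correspond to the density integrating to $1$ and to $E[Z^2] = 1$ respectively.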
Mathematically, probability statements for a general normal random variable can always be simplified to probability statements for a standard normal random variable, and can often then be written in terms of $\Phi$. Thus, our R examples will always use the standard normal; if you use the general normal in R, note that the functions take the standard deviation $\sigma$ as an argument, not the variance $\sigma^2$.
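A normal model is proposed to model the variation in height of women with parameters $\mu$ and $\sigma^2$ (heights measured in cm). Write in terms of $\Phi$, and evaluate numerically, the probability that a randomly selected woman is over 180 cm tall.
Solution. Let $X \sim N(\mu, \sigma^2)$ denote the height. Here $(180 - \mu)/\sigma = 5/3$, so
$$P(X > 180) = 1 - P(X \le 180) = 1 - \Phi\!\left(\frac{180 - \mu}{\sigma}\right) = 1 - \Phi(5/3) \approx 0.0478,$$
since 1-pnorm(5/3) gives $0.0478$.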
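The upper-tail probability in this example can be computed in R in two equivalent ways. A minimal sketch, using only the standardised value $5/3$ that appears in the solution (the variable names are illustrative):

```r
# Upper-tail probability P(Z > 5/3) for Z ~ N(0, 1), computed two ways
p_complement <- 1 - pnorm(5 / 3)                   # complement of the cdf
p_upper_tail <- pnorm(5 / 3, lower.tail = FALSE)   # upper tail requested directly

print(round(c(complement = p_complement, upper_tail = p_upper_tail), 4))
```

The `lower.tail = FALSE` form avoids subtracting from 1, which can matter numerically for very small tail probabilities.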
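Find the lower quartile of $X \sim N(1, 9)$.
Solution. The lower quartile $q$ satisfies
$$0.25 = P(X \le q) = \Phi\!\left(\frac{q - 1}{3}\right).$$
So $q = 1 + 3\,\Phi^{-1}(0.25) \approx -1.023$ (evaluated as 1+3*qnorm(0.25)).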
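The quartile can also be obtained from the general-normal form of qnorm(). The sketch below assumes, as the expression 1+3*qnorm(0.25) indicates, a mean of 1 and a standard deviation of 3; the variable names are illustrative.

```r
# Lower quartile via the standard normal quantile, then rescaled
q_standardised <- 1 + 3 * qnorm(0.25)

# Same quantile from the general normal; note sd = 3, not the variance 9
q_direct <- qnorm(0.25, mean = 1, sd = 3)

# Round trip: the cdf at the quartile should return 0.25
check <- pnorm(q_direct, mean = 1, sd = 3)

print(c(standardised = q_standardised, direct = q_direct, cdf_at_q = check))
```

Passing `sd = 9` here would silently give the quartile of a different distribution, which is the pitfall the text warns about.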
, find in terms of , the cdf of the standard normal distribution.
Solution.