2.3.4 Normal probability examples

Cumulative SAT scores are approximated well by a normal model, N(μ=1500, σ²=90,000).

Example 2.3.6

Shannon is a randomly selected SAT taker, and nothing is known about Shannon’s SAT aptitude. What is the probability Shannon scores at least 1630 on her SATs?

Answer. First, always draw and label a picture of the normal distribution. (Drawings need not be exact to be useful.) We are interested in the chance she scores above 1630, so we shade this upper tail:

The picture shows the mean and the values at 2 standard deviations above and below the mean. The simplest way to find the shaded area under the curve makes use of the Z score of the cutoff value. With μ=1500, σ=300, and the cutoff value x=1630, the Z score is computed as

\[ Z = \frac{x-\mu}{\sigma} = \frac{1630-1500}{300} = \frac{130}{300} = 0.43 \]

We type pnorm(0.43) in R, which gives P(Z<0.43)=0.6664022. However, this quantile describes those who had a Z score lower than 0.43, which is the area to the left of the cutoff. To find the area above Z=0.43, we compute one minus the area of the lower tail:

\[ 1 - 0.6664 = 0.3336 \]

The probability Shannon scores at least 1630 on the SAT is P(Z>0.43)=0.3336.
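The calculation in Example 2.3.6 can be sketched in R as follows. The variable names (mu, sigma, x, z, lower, upper) are chosen here for illustration only, and Z is rounded to two decimal places to match the working above.

```r
# Example 2.3.6 via the Z score (illustrative variable names)
mu    <- 1500   # mean of the SAT model
sigma <- 300    # standard deviation, the square root of the variance 90,000
x     <- 1630   # cutoff score

z     <- round((x - mu) / sigma, 2)  # 0.43, rounded as in the worked answer
lower <- pnorm(z)                    # P(Z < 0.43), the area to the left
upper <- 1 - lower                   # P(Z > 0.43), the shaded upper tail

print(round(c(z = z, lower = lower, upper = upper), 4))
```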

Note: You can use R to calculate the probability directly without standardizing to a Z score. For example, pnorm(1630,mean=1500,sd=300) gives the lower tail area P(X<1630), and one minus this value gives the upper tail. We encourage use of the Z score in this course for three reasons: people make mistakes when entering the parameters (such as inputting the variance instead of the standard deviation); Z values are useful in their own right; and the probabilities given in the exam will be in terms of Z values, so it is important that you become familiar with calculating them.
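As a rough illustration of the note above, the sketch below compares the direct calculation with the Z score route, and shows what happens if the variance is passed where the standard deviation is expected. The variable names are illustrative. The direct answer differs slightly from 0.3336 because Z is not rounded to 0.43 first.

```r
mu    <- 1500
sigma <- 300
x     <- 1630

# Upper tail computed directly on the original scale
direct <- 1 - pnorm(x, mean = mu, sd = sigma)

# Same upper tail via the unrounded Z score
via_z <- 1 - pnorm((x - mu) / sigma)

# MISTAKE shown on purpose: the variance (sigma^2) is passed as sd
wrong <- 1 - pnorm(x, mean = mu, sd = sigma^2)

print(c(direct = direct, via_z = via_z, wrong = wrong))
```

The first two values agree, while the third is close to 0.5, far from the correct answer. Checking the result against the picture is a quick way to catch this kind of slip.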



TIP: always draw a picture first, and find the Z score second. For any normal probability situation, always always always draw and label the normal curve and shade the area of interest first. The picture will provide an estimate of the probability. After drawing a figure to represent the situation, identify the Z score for the observation of interest.
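A hand-drawn sketch is all that is needed, but the picture can also be produced in R. The following is a minimal sketch using base graphics; shade_upper_tail is a hypothetical helper written for this illustration, not a built-in R function, and the plotting range of four standard deviations either side of the mean is an arbitrary choice.

```r
# Hypothetical helper: draw N(mu, sigma) and shade the area above a cutoff.
# Returns (invisibly) the exact upper tail area for comparison with the picture.
shade_upper_tail <- function(cutoff, mu, sigma) {
  lo <- mu - 4 * sigma
  hi <- mu + 4 * sigma
  xs <- seq(lo, hi, length.out = 400)

  # Draw the normal density curve
  plot(xs, dnorm(xs, mean = mu, sd = sigma), type = "l",
       xlab = "x", ylab = "density")

  # Shade the region to the right of the cutoff
  tail_x <- seq(cutoff, hi, length.out = 200)
  polygon(c(cutoff, tail_x, hi),
          c(0, dnorm(tail_x, mean = mu, sd = sigma), 0),
          col = "grey70", border = NA)

  # Mark the mean with a dashed vertical line
  abline(v = mu, lty = 2)

  invisible(1 - pnorm(cutoff, mean = mu, sd = sigma))
}

area <- shade_upper_tail(1630, mu = 1500, sigma = 300)
```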

Example 2.3.7

Edward earned a 1400 on his SAT. What is his quantile?

Answer. First, a picture is needed. Edward’s quantile is the proportion of people who do not get as high as a 1400. These are the scores to the left of 1400.

Identifying the mean μ=1500, the standard deviation σ=300, and the cutoff for the tail area x=1400 makes it easy to compute the Z score:

\[ Z = \frac{x-\mu}{\sigma} = \frac{1400-1500}{300} = -0.33 \]

Using R, pnorm(-0.33) gives P(Z<-0.33)=0.3707. Edward is at the 37th quantile.
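Example 2.3.7 can be reproduced with a few lines of R. This is only a sketch with illustrative variable names; Z is rounded to two decimal places to match the working above.

```r
mu    <- 1500
sigma <- 300
x     <- 1400   # Edward's score

z <- round((x - mu) / sigma, 2)  # -0.33
edward_quantile <- pnorm(z)      # P(Z < -0.33), the area to the left of 1400

print(round(edward_quantile, 4))
```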



TIP: areas to the right. The pnorm() function gives the area to the left. If you would like the area to the right, first find the area to the left and then subtract this amount from one.
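The tip can be checked in R. The sketch below uses Z=0.43 from Example 2.3.6 and also shows pnorm()'s lower.tail argument, which returns the right-hand area directly. The variable names are illustrative.

```r
z <- 0.43

# Area to the right by subtraction, as described in the tip
right_by_subtraction <- 1 - pnorm(z)

# Same area using the lower.tail argument of pnorm()
right_direct <- pnorm(z, lower.tail = FALSE)

print(c(right_by_subtraction, right_direct))
```

Both approaches give the same value. The subtraction form mirrors what you would do by hand with a table of lower tail probabilities.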

The last several problems have focused on finding the probability or quantile for a particular observation. What if you would like to know the observation corresponding to a particular quantile?

Based on a sample of 100 men, the heights of male adults between the ages of 20 and 62 in the UK are nearly normal with mean 177.8cm and standard deviation 8.382cm.

Example 2.3.8

Erik’s height is at the 40th quantile. How tall is he?

Answer. As always, first draw the picture.

In this case, the lower tail probability is known (0.40), which can be shaded on the diagram. We want to find the observation that corresponds to this value. As a first step in this direction, we determine the Z score associated with the 40th quantile. Because the quantile is below 50%, we know Z will be negative. In R this is qnorm(0.4) = -0.2533471. In maths, this means finding the value z for which P(Z<z)=0.4.

Knowing Z_Erik=-0.25 and the population parameters μ=177.8 and σ=8.382 centimetres, the Z score formula can be set up to determine Erik’s unknown height, labelled x_Erik:

\[ -0.25 = Z_{\text{Erik}} = \frac{x_{\text{Erik}}-\mu}{\sigma} = \frac{x_{\text{Erik}}-177.8}{8.382} \]

Solving for x_Erik (multiply both sides by 8.382, then add 177.8) gives x_Erik = 177.8 + (-0.25)(8.382), which yields the height 175.7cm. In R this can be done in one step using qnorm(0.4,mean=177.8,sd=8.382). Importantly, you must know how to do this by hand for the exam.
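The two routes for Example 2.3.8 can be compared in R. This is a sketch with illustrative variable names: the first route mirrors the by-hand method (find Z, round it to two decimal places, then rearrange the Z score formula), and the second is the one-step qnorm() call.

```r
mu    <- 177.8   # mean height in cm
sigma <- 8.382   # standard deviation in cm

# By-hand route: Z score for the 40th quantile, rounded as in the text
z <- round(qnorm(0.4), 2)       # -0.25

# Rearranged Z score formula: x = mu + Z * sigma
x_by_hand <- mu + z * sigma

# One-step route on the original scale (no rounding of Z)
x_direct <- qnorm(0.4, mean = mu, sd = sigma)

print(round(c(by_hand = x_by_hand, direct = x_direct), 1))
```

The two answers agree to one decimal place. Any small difference beyond that comes from rounding Z to -0.25 in the by-hand route.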

Example 2.3.9

What is the probability that a randomly selected adult male is between 175.26cm and 187.96cm tall?

Answer. First, draw the figure. The area of interest is no longer an upper or lower tail.

The total area under the curve is 1. If we find the area of the two tails that are not shaded (from Exercise 2.3.10 in the extra examples, these areas are 0.3821 and 0.1131), then we can find the middle area:

\[ 1 - 0.3821 - 0.1131 = 0.5048 \]

That is, the probability of being between 175.26 and 187.96 is P(175.26<X<187.96)=0.5048. In R this is pnorm(187.96,mean=177.8,sd=8.382)-pnorm(175.26,mean=177.8,sd=8.382). Or you could work out the Z scores first.
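Both approaches to Example 2.3.9 can be sketched in R. The variable names are illustrative. The tail-based route rounds each Z score to two decimal places, as you would by hand, so it can differ slightly from the direct calculation.

```r
mu        <- 177.8
sigma     <- 8.382
lower_cut <- 175.26
upper_cut <- 187.96

# Direct route: difference of two lower tail areas on the original scale
middle_direct <- pnorm(upper_cut, mean = mu, sd = sigma) -
  pnorm(lower_cut, mean = mu, sd = sigma)

# Z score route: find the two unshaded tails, then subtract from 1
z_lo <- round((lower_cut - mu) / sigma, 2)   # -0.30
z_hi <- round((upper_cut - mu) / sigma, 2)   #  1.21
left_tail  <- pnorm(z_lo)                    # area below 175.26
right_tail <- 1 - pnorm(z_hi)                # area above 187.96
middle_by_tails <- 1 - left_tail - right_tail

print(round(c(direct = middle_direct, by_tails = middle_by_tails), 4))
```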