2.8.3 A sampling distribution for the mean

In Section 2.7.3, we introduced a sampling distribution for x̄, the average run time for samples of size 100. We examined this distribution earlier in Figure LABEL:netTime1000SamplingDistribution. Now we’ll take 100,000 samples, calculate the mean of each, and plot them in a histogram to get an especially accurate depiction of the sampling distribution. This histogram is shown in the left panel of Figure LABEL:netTimeBigSamplingDistribution.

See the Moodle file for the code for the simulation.

Does this distribution look familiar? Hopefully so! The distribution of sample means closely resembles the normal distribution (see Section 2.3). A normal probability plot of these sample means is shown in the right panel of Figure LABEL:netTimeBigSamplingDistribution. Because the points all fall close to a straight line, we can conclude that the distribution of sample means is nearly normal. This result can be explained by the Central Limit Theorem.
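The Moodle code is not reproduced here, but the idea can be sketched in R. The following is only a minimal sketch, not the course's simulation: it assumes a hypothetical, mildly right-skewed population of run times (`population`, drawn from a gamma distribution with a mean of 95 minutes) in place of the real data, and it uses 10,000 samples rather than 100,000 to keep the run time short. All variable names are illustrative.

```r
set.seed(1)  # make the simulation reproducible

# Hypothetical population of run times in minutes (an assumption, not the course data):
# a mildly right-skewed gamma distribution with mean 95 and standard deviation about 15.8.
population <- rgamma(20000, shape = 36, rate = 36 / 95)

n_samples <- 10000   # number of repeated samples (the text uses 100,000)
n <- 100             # size of each sample

# Draw a sample of size n, compute its mean, and repeat n_samples times.
sample_means <- replicate(n_samples, mean(sample(population, n)))

# Histogram of the sample means (compare with the left panel described above).
hist(sample_means, breaks = 50,
     main = "Sampling distribution of the mean",
     xlab = "Sample mean (minutes)")

# Normal probability plot; points close to the line suggest near-normality.
qqnorm(sample_means)
qqline(sample_means)

# The spread of the sample means should be close to sd(population) / sqrt(n).
sd(sample_means)
sd(population) / sqrt(n)
```

Changing `n_samples` to 100000 matches the number of samples described in the text, at the cost of a longer run.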



Central Limit Theorem, informal description: If a sample consists of at least 30 independent observations and the data are not strongly skewed, then the distribution of the sample mean is well approximated by a normal model.

We will apply this informal version of the Central Limit Theorem for now, and discuss its details further in Section 2.10.

The choice of using 1.96 standard errors in Equation (2.8) was based on our general guideline that roughly 95% of the time, observations are within 1.96 standard deviations of the mean. Under the normal model, this relationship is essentially exact. Recall that in R, qnorm(0.975) returns approximately 1.96. (We use 0.975 because we want the confidence interval to cover the central 95% symmetrically, which leaves 2.5% on either side of the interval.)
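The 1.96 multiplier and its coverage can be checked directly with R's built-in `qnorm` and `pnorm` functions. This is a small check rather than part of the original notes, and the variable name `z` is only illustrative.

```r
# 97.5th percentile of the standard normal distribution: about 1.96.
z <- qnorm(0.975)
z

# Probability that a standard normal value lies within z of the mean: 0.95.
pnorm(z) - pnorm(-z)
```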

point estimate ± 1.96 × SE    (2.9)

If a point estimate, such as x̄, is associated with a normal model and standard error SE, then we use this more precise 95% confidence interval.
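As a rough illustration of Equation (2.9), the sketch below computes a 95% confidence interval for a mean in R. The sample `x` is simulated, hypothetical data rather than the course data, and the standard error is assumed to be the sample standard deviation divided by the square root of the sample size, the usual choice for a sample mean.

```r
set.seed(2)  # make the example reproducible

# Hypothetical sample of 100 run times in minutes (simulated, not real data).
x <- rnorm(100, mean = 95, sd = 16)

x_bar <- mean(x)                 # point estimate
se <- sd(x) / sqrt(length(x))    # standard error of the sample mean (assumed formula)

# 95% confidence interval: point estimate +/- 1.96 * SE
ci <- x_bar + c(-1, 1) * qnorm(0.975) * se
ci
```

Using `qnorm(0.975)` instead of typing 1.96 keeps the multiplier tied to the chosen confidence level. For a 90% interval, for example, it would become `qnorm(0.95)`.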