2.7 Variability in estimates


2.7.3 Standard error of the mean

From the random sample represented in LonMar13Samp, we estimated that the average time it takes to run 26 miles is 273.4978 minutes. Suppose we take another random sample of 100 individuals and take its mean: 276.8065 minutes. Suppose we took another (272.7993 minutes) and another (271.1928 minutes), and so on. If we do this many, many times – which we can do only because we have the entire population data set – we can build up a sampling distribution for the sample mean when the sample size is 100, shown in Figure LABEL:netTime1000SamplingDistribution.
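This repeated-sampling procedure can be sketched in Python. The population below is a synthetic stand-in (a normal distribution matched to the mean and standard deviation quoted in this section), since the actual marathon data set lives on Moodle; the simulated numbers will therefore be close to, but not identical to, those quoted here:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the marathon net run times (minutes), matched to
# the population mean and sd quoted in the text (mu = 272.1, sigma = 49.87).
population = rng.normal(loc=272.1001, scale=49.87072, size=50_000)

# Build the sampling distribution of the sample mean for samples of size 100
sample_means = np.array([
    rng.choice(population, size=100, replace=False).mean()
    for _ in range(1000)
])

print(sample_means.mean())  # close to the population mean
print(sample_means.std())   # close to sigma / sqrt(100)
```

The histogram of `sample_means` would resemble the unimodal, symmetric shape described below for Figure LABEL:netTime1000SamplingDistribution.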

See the Moodle file for the code for the simulation.



Sampling distribution: The sampling distribution represents the distribution of the point estimates based on samples of a fixed size from a certain population. It is useful to think of a particular point estimate as being drawn from such a distribution. Understanding the concept of a sampling distribution is central to understanding statistical inference.

The sampling distribution shown in Figure LABEL:netTime1000SamplingDistribution is unimodal and approximately symmetric. It is also centred exactly at the true population mean: μ = 272.1001. Intuitively, this makes sense: the sample means should tend to "fall around" the population mean.

We can see that the sample mean has some variability around the population mean, which can be quantified using the standard deviation of this distribution of sample means: σx¯=5.837426. The standard deviation of the sample mean tells us how far the typical estimate is away from the actual population mean, 272.1001 minutes. It also describes the typical error of the point estimate, and for this reason we usually call this standard deviation the standard error (SE) of the estimate.



Standard error of an estimate: The standard deviation associated with an estimate is called the standard error. It describes the typical error or uncertainty associated with the estimate.

When considering the case of the point estimate x¯, there is one problem: there is no obvious way to estimate its standard error from a single sample. However, statistical theory provides a helpful tool to address this issue.

Example 2.7.3

(a) Would you rather use a small sample or a large sample when estimating a parameter? Why?
(b) Using your reasoning from (a), would you expect a point estimate based on a small sample to have smaller or larger standard error than a point estimate based on a larger sample?

Answer. (a) Consider two random samples: one of size 10 and one of size 1000. Individual observations in the small sample are highly influential on the estimate, while in the larger sample individual observations would more often average each other out. The larger sample would tend to provide a more accurate estimate.
(b) If we think an estimate is better, we probably mean it typically has less error. Based on (a), our intuition suggests that a larger sample size corresponds to a smaller standard error. In the sample of 100 runners, the standard error of the sample mean is equal to one-tenth of the population standard deviation: 4.987072 = 49.87072/10. In other words, the standard error of the sample mean based on 100 observations is equal to

SE_x̄ = σ_x̄ = σ_x / √n = 49.87072 / √100 = 4.987072

where σ_x is the standard deviation of the individual observations. This is no coincidence. Using the probability tools of Math104, we can show mathematically that this equation is correct whenever the observations are independent.



Computing SE for the sample mean: Given n independent observations from a population with standard deviation σ, the standard error of the sample mean is equal to

SE = σ / √n    (2.7)

A reliable method to ensure sample observations are independent is to conduct a simple random sample consisting of less than 10% of the population.
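As a quick numeric check of Equation (2.7), here is a minimal Python sketch using the marathon figures quoted earlier in this section:

```python
import math

sigma = 49.87072  # population standard deviation of the net run times (minutes)
n = 100           # sample size

se = sigma / math.sqrt(n)
print(se)  # approximately 4.987072, matching the value in the text
```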

There is one subtle issue with Equation (2.7): the population standard deviation is typically unknown. You might have already guessed how to resolve this problem: we can use the point estimate of the standard deviation from the sample. This estimate tends to be sufficiently good when the sample size is at least 30 and the population distribution is not strongly skewed. Thus, we often just use the sample standard deviation s instead of σ. When the sample size is smaller than 30, we will need to use a method to account for the extra uncertainty in the standard error. If the skew condition is not met, a larger sample is needed to compensate for the extra skew. These topics are further discussed in Section 2.10.
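The plug-in approach can be sketched as follows. A minimal sketch, in which a hypothetical sample is simulated in place of the real data (in practice `sample` would be the observed data, e.g. LonMar13Samp):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single sample of 100 run times (minutes); stands in for
# the one sample we would actually observe.
sample = rng.normal(loc=272.1, scale=49.87, size=100)

s = sample.std(ddof=1)             # sample standard deviation (n - 1 divisor)
se_hat = s / np.sqrt(sample.size)  # plug-in estimate: SE ≈ s / sqrt(n)
print(s, se_hat)
```

Note the `ddof=1` argument, which gives the usual sample standard deviation with the n − 1 divisor rather than NumPy's default population formula.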

Example 2.7.4

(a) Would you be more trusting of a sample that has 100 observations or 400 observations?
(b) We want to show mathematically that our estimate tends to be better when the sample size is larger. If the standard deviation of the individual observations is 10, what is our estimate of the standard error when the sample size is 100? What about when it is 400?
(c) Explain how your answer to (b) mathematically justifies your intuition in part (a).

Answer. (a) Extra observations are usually helpful in understanding the population, so a point estimate with 400 observations seems more trustworthy.
(b) The standard error when the sample size is 100 is given by SE_100 = 10/√100 = 1. For 400: SE_400 = 10/√400 = 0.5. The larger sample has a smaller standard error.
(c) The standard error of the sample with 400 observations is lower than that of the sample with 100 observations. The standard error describes the typical error, and since it is lower for the larger sample, this mathematically shows the estimate from the larger sample tends to be better – though it does not guarantee that every large sample will provide a better estimate than a particular small sample.
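The arithmetic in part (b) can be checked directly with a short Python sketch:

```python
import math

sigma = 10  # standard deviation of the individual observations

print(sigma / math.sqrt(100))  # 1.0
print(sigma / math.sqrt(400))  # 0.5
```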