2.8.4 Changing the confidence level

Suppose we want to consider confidence intervals where the confidence level is somewhat higher than 95%: perhaps we would like a confidence level of 99%. Think back to the analogy about trying to catch a fish: if we want to be more sure that we will catch the fish, we should use a wider net. To create a 99% confidence interval, we must likewise widen our 95% interval. On the other hand, if we want an interval with lower confidence, such as 90%, we could make our original 95% interval slightly narrower.

The structure of the 95% confidence interval provides guidance on how to make intervals with new confidence levels. Below is a general 95% confidence interval for a point estimate that comes from a nearly normal distribution:

$$\text{point estimate}\ \pm\ 1.96\times SE \tag{2.10}$$

There are three components to this interval: the point estimate, 1.96, and the standard error. The choice of $1.96\times SE$ was based on capturing 95% of the sampling distribution: the estimate is within 1.96 standard errors of the parameter about 95% of the time. The choice of 1.96 therefore corresponds to a 95% confidence level.
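The claim that the estimate lands within 1.96 standard errors of the parameter about 95% of the time can be checked with a small simulation. The sketch below is in Python rather than R, and the parameter value, standard error, and number of draws are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical values chosen only for illustration (not from the text).
TRUE_PARAMETER = 50.0
STANDARD_ERROR = 2.0
N_DRAWS = 100_000

rng = random.Random(0)  # fixed seed so the run is reproducible

# Each draw plays the role of one point estimate from a nearly normal
# sampling distribution centred on the parameter.
estimates = [rng.gauss(TRUE_PARAMETER, STANDARD_ERROR) for _ in range(N_DRAWS)]

# Count how often the interval estimate +/- 1.96 * SE contains the parameter.
covered = sum(
    est - 1.96 * STANDARD_ERROR <= TRUE_PARAMETER <= est + 1.96 * STANDARD_ERROR
    for est in estimates
)
coverage = covered / N_DRAWS
print(f"proportion of intervals containing the parameter: {coverage:.3f}")
```

The printed proportion should be close to 0.95; replacing 1.96 with a larger multiplier raises it, which is the "wider net" idea in numerical form.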

Example 2.8.5

If $X$ is a normally distributed random variable, how often will $X$ be within 2.58 standard deviations of the mean?

Answer. This is equivalent to asking how often the $Z$ score will be larger than $-2.58$ but less than $2.58$. (For a picture, see Figure LABEL:choosingZForCI.) To determine this probability, look up $-2.58$ and $2.58$ using R: $\mathbb{P}(Z<-2.58)=0.0049$ and $\mathbb{P}(Z<2.58)=0.9951$. Thus, there is a $\mathbb{P}(-2.58<Z<2.58)=0.9951-0.0049\approx 0.99$ probability that the unobserved random variable $X$ will be within 2.58 standard deviations of $\mu$.

To create a 99% confidence interval, change 1.96 in the 95% confidence interval formula to $2.58$. Example 2.8.5 shows that 99% of the time a normal random variable will be within 2.58 standard deviations of the mean. This approach, using the $Z$ scores in the normal model to compute confidence levels, is appropriate when $\bar{x}$ is associated with a normal distribution with mean $\mu$ and standard deviation $SE_{\bar{x}}$. Thus, the formula for a 99% confidence interval is

$$\bar{x}\ \pm\ 2.58\times SE_{\bar{x}} \tag{2.11}$$
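The two tail probabilities quoted in Example 2.8.5 can be reproduced numerically. The text does this lookup in R; the sketch below instead uses `statistics.NormalDist` from the Python standard library, whose `cdf` method plays the same role as a lower-tail normal probability lookup.

```python
from statistics import NormalDist

# Standard normal model, N(mu = 0, sigma = 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Lower-tail probabilities P(Z < -2.58) and P(Z < 2.58).
lower_tail = standard_normal.cdf(-2.58)
upper_cdf = standard_normal.cdf(2.58)

# Probability that Z falls between -2.58 and 2.58.
central_probability = upper_cdf - lower_tail

print(f"P(Z < -2.58)        = {lower_tail:.4f}")
print(f"P(Z <  2.58)        = {upper_cdf:.4f}")
print(f"P(-2.58 < Z < 2.58) = {central_probability:.4f}")
```

The central probability comes out very slightly above 0.99, which is why 2.58 is used as the multiplier for a 99% interval.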

The normal approximation is crucial to the precision of these confidence intervals. Section 2.10 provides a more detailed discussion about when the normal model can safely be applied. When the normal model is not a good fit, we will use alternative distributions that better characterize the sampling distribution.

Verifying independence is often the most difficult of the conditions to check, and the way to check for independence varies from one situation to another. However, we can provide simple rules for the most common scenarios.

TIP: How to verify sample observations are independent

Observations in a simple random sample consisting of less than 10% of the population are independent.

Conditions for $\bar{x}$ being nearly normal and $SE$ being accurate

Important conditions to help ensure the sampling distribution of $\bar{x}$ is nearly normal and the estimate of $SE$ is sufficiently accurate:

• The sample observations are independent.
• The sample size is large: $n\geq 30$ is a good rule of thumb.
• The distribution of sample observations is not strongly skewed.

Additionally, the larger the sample size, the more lenient we can be with the sample’s skew.

Caution: Independence for random processes and experiments

If a sample is from a random process or experiment, it is important to verify that the observations from the process or subjects in the experiment are nearly independent and maintain their independence throughout the process or experiment. Usually subjects are considered independent if they undergo random assignment in an experiment.

Confidence interval for any confidence level

If the point estimate follows the normal model with standard error $SE$, then a confidence interval for the population parameter is $\text{point estimate}\ \pm\ z^{\star}SE$, where $z^{\star}$ corresponds to the confidence level selected. We can calculate $z^{\star}$ in R using qnorm, the normal quantile function.

Figure LABEL:choosingZForCI provides a picture of how to identify $z^{\star}$ based on a confidence level. We select $z^{\star}$ so that the area between $-z^{\star}$ and $z^{\star}$ in the normal model corresponds to the confidence level.

Margin of error

In a confidence interval, $z^{\star}\times SE$ is called the margin of error.
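As a minimal sketch of how $z^{\star}$ and the margin of error follow from a chosen confidence level, the Python snippet below uses `NormalDist.inv_cdf` from the standard library as the normal quantile function. The helper names `z_star` and `margin_of_error` are hypothetical and introduced only for this illustration.

```python
from statistics import NormalDist


def z_star(confidence_level: float) -> float:
    """Return z* such that the area between -z* and z* equals the level."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Half of the leftover area sits in each tail, so z* is the quantile
    # with (1 - level) / 2 of the distribution above it.
    tail_area = (1.0 - confidence_level) / 2.0
    return NormalDist().inv_cdf(1.0 - tail_area)


def margin_of_error(confidence_level: float, standard_error: float) -> float:
    """Return z* times SE for the given confidence level."""
    return z_star(confidence_level) * standard_error


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z* = {z_star(level):.3f}")
```

The multiplier grows with the confidence level, so a higher level always gives a wider interval for the same standard error.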

Example 2.8.6

Create a 90% confidence interval for the average time for all runners in the 2013 London Marathon. The point estimate is $\bar{x}=273.4978$ and the standard error is $SE_{\bar{x}}=4.987072$.

Answer. We first find $z^{\star}$ such that 90% of the distribution falls between $-z^{\star}$ and $z^{\star}$ in the standard normal model, $N(\mu=0,\sigma=1)$. Leaving 5% in each tail means that 95% of the distribution lies below $z^{\star}$, so $\mathbb{P}(Z<z^{\star})=0.95$. In R, qnorm(0.95) = 1.644854 $\approx$ 1.645, so $z^{\star}=1.645$. The 90% confidence interval can then be computed as $\bar{x}\ \pm\ 1.645\times SE_{\bar{x}}\to(265.2941,281.7015)$. (We had already verified conditions for normality and the standard error.) That is, we are 90% confident the average time is larger than 265.2941 but less than 281.7015 minutes.
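The calculation in Example 2.8.6 can be sketched in a few lines. This version uses Python's `statistics.NormalDist` in place of R, takes $\bar{x}$ and $SE_{\bar{x}}$ from the example, and rounds $z^{\star}$ to three decimal places as the answer does. The endpoints shift slightly in the second decimal place depending on how many digits of $z^{\star}$ are kept.

```python
from statistics import NormalDist

# Values stated in Example 2.8.6.
x_bar = 273.4978          # point estimate (minutes)
se_x_bar = 4.987072       # standard error of the sample mean
confidence_level = 0.90

# 5% in each tail, so z* is the 95th percentile of the standard normal.
upper_quantile = 1.0 - (1.0 - confidence_level) / 2.0
z = round(NormalDist().inv_cdf(upper_quantile), 3)  # 1.645

margin = z * se_x_bar
lower, upper = x_bar - margin, x_bar + margin

print(f"z* = {z}")
print(f"90% confidence interval: ({lower:.4f}, {upper:.4f})")
```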