We can start the evaluation of the hypothesis setup by comparing 2009 and 2013 run times using a point estimate from the 2013 sample, the sample mean $\bar{x}_{13}$, measured in minutes. This estimate suggests the average time is actually longer than the 2009 time, 272.5002 minutes. However, to evaluate whether this provides strong evidence that there has been a change, we must consider the uncertainty associated with $\bar{x}_{13}$.
We learned in Section 2.7 that there is fluctuation from one sample to another, and it is very unlikely that the sample mean will be exactly equal to our parameter; we should not expect $\bar{x}_{13}$ to exactly equal $\mu_{13}$. Even though $\bar{x}_{13}$ is larger than 272.5002, it is still possible that the population average in 2013 has remained unchanged from 2009. The difference between $\bar{x}_{13}$ and 272.5002 could be due to sampling variation, i.e. the variability associated with the point estimate when we take a random sample.
In Section 2.8, confidence intervals were introduced as a way to find a range of plausible values for the population mean. A 95% confidence interval for the 2013 population mean, $\mu_{13}$, was calculated from LonMar13Samp.
Because the 2009 mean, 272.5002, falls within this range of plausible values, we cannot say the null hypothesis is implausible. That is, we failed to reject the null hypothesis, $H_0$.
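The reasoning above (build a 95% confidence interval, then ask whether the null value lies inside it) can be sketched in R. This is only an illustration under stated assumptions: `ci_test` is a hypothetical helper, not a function from any package, and the data are simulated stand-ins rather than the LonMar13Samp run times. The critical value 1.96 assumes the normal model applies to the sample mean.

```r
# Sketch: two-sided test at the 5% level using a 95% confidence interval.
# ci_test() is a hypothetical helper written for this illustration.
ci_test <- function(x, mu0, z_star = 1.96) {
  x_bar <- mean(x)                      # point estimate
  se    <- sd(x) / sqrt(length(x))      # estimated standard error
  ci    <- c(lower = x_bar - z_star * se,
             upper = x_bar + z_star * se)
  # Reject H0 only if the null value falls outside the interval.
  list(ci = ci, reject = mu0 < ci[["lower"]] || mu0 > ci[["upper"]])
}

# Simulated run times in minutes (an assumption, NOT the real sample).
set.seed(1)
times  <- rnorm(100, mean = 275, sd = 50)
result <- ci_test(times, mu0 = 272.5002)
result$ci       # range of plausible values for the population mean
result$reject   # FALSE means we fail to reject H0
```

With a real sample, `times` would be replaced by the observed run times. Base R's `t.test(x, mu = mu0)` reports a comparable interval based on the t distribution rather than the fixed value 1.96.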
TIP: Double negatives can sometimes be used in statistics
In many statistical explanations, we use double negatives. For instance, we might say that the null
hypothesis is not implausible or we failed to reject the null hypothesis. Double
negatives are used to communicate that while we are not rejecting a position, we are also not
saying it is correct.
Universities frequently provide estimates of student expenses such as housing. The NUS claims that the average student housing expense is £123.87 per week. What are the null and alternative hypotheses to test whether this claim is accurate?
Answer. $H_0$: The average cost is £123.87 per week, $\mu = 123.87$.
$H_A$: The average cost is different from £123.87 per week, $\mu \neq 123.87$.
Lancaster University decides to collect data to evaluate the £123.87 per week claim. They take a random sample of 75 students at their university and obtain the data represented in the histogram below. Can we apply the normal model to the sample mean?
R> data(housing)
R> hist(housing,breaks=10)
Answer. Applying the normal model requires that certain conditions are met. Because the data are a simple random sample and the sample (presumably) represents no more than 10% of all students at the university, the observations are independent. The sample size is also sufficiently large ($n = 75$) and the data exhibit only moderate skew. Thus, the normal model may be applied to the sample mean.
The sample mean for student housing is £115.83 and the sample standard
deviation is £34.46. Construct a 95% confidence interval for the population mean and
evaluate the hypotheses of Exercise 2.9.2.
R> mean(housing);sd(housing)
Answer. The standard error associated with the mean may be estimated using the sample standard deviation divided by the square root of the sample size. Recall that $n = 75$ students were sampled.
\[ SE = \frac{s}{\sqrt{n}} = \frac{34.46}{\sqrt{75}} \approx 3.98 \]
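Answer. You showed in Exercise 2.9.3 that the normal model may be applied to the sample mean. This ensures a 95% confidence interval may be accurately constructed:
\[ \bar{x} \pm z^{\star} SE \quad\to\quad 115.83 \pm 1.96 \times 3.98 \quad\to\quad (108.03, 123.63) \]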
Because the null value £123.87 is not in the confidence interval, a true mean of £123.87 is implausible and we reject the null hypothesis. The data provide statistically significant evidence that the actual average housing expense is less than £123.87 per week.
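As a check on the arithmetic, the interval can be recomputed in R from the summary statistics quoted in the exercise (sample mean 115.83, sample standard deviation 34.46, 75 students). This is a minimal sketch: the variable names are illustrative, and the critical value 1.96 assumes the normal model, as argued in Exercise 2.9.3.

```r
# Summary statistics reported in the text for the housing sample.
x_bar <- 115.83   # sample mean, pounds per week
s     <- 34.46    # sample standard deviation
n     <- 75       # number of students sampled
mu0   <- 123.87   # null value claimed by the NUS

z_star <- 1.96    # 95% critical value under the normal model (assumed)
se <- s / sqrt(n) # estimated standard error of the mean
ci <- c(lower = x_bar - z_star * se,
        upper = x_bar + z_star * se)

round(se, 2)      # 3.98
round(ci, 2)      # 108.03 123.63

# Reject H0 when the null value lies outside the interval.
reject <- mu0 < ci[["lower"]] || mu0 > ci[["upper"]]
reject            # TRUE
```

The null value 123.87 lies just above the upper limit, which is why the conclusion is to reject the null hypothesis.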