The following tests allow us to reach a conclusion about whether or not a population mean takes a particular (range of) value(s). For the tests described here, we assume that the data are a sample from a Normal distribution.
In fact, provided that the sample is large enough and the data are continuous with finite variance, this assumption need not hold as the Central Limit Theorem can be used as a justification for a Normal sampling distribution. However, if the data are significantly skewed, then the mean may not be a sensible summary parameter. Can you think why? What might be a better summary of the location of the data?
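Suppose that data $x_1, \ldots, x_n$ are assumed to be realisations of IID random variables $X_1, \ldots, X_n$ with distribution $N(\mu, \sigma^2)$. We want to test the null hypothesis
$$H_0: \mu = \mu_0$$
against one of three possible alternatives,
$H_1: \mu > \mu_0$;
$H_1: \mu < \mu_0$;
$H_1: \mu \neq \mu_0$.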
The first two of these are one-tailed hypotheses; the third is a two-tailed hypothesis. The following algorithm describes the one-sample $t$-test, which can be used to do exactly this.
Calculate the sample mean $\bar{x}$ and sample variance $s^2$ of the data.
Calculate the test statistic
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}.$$
Compare the test statistic to Student's $t$-distribution with $n - 1$ degrees of freedom.
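The first two steps of the algorithm can be sketched in R as follows. This is only a minimal illustration: the function name `one_sample_t` and the data vector `x` are made up for the purpose and do not come from the notes.

```r
# Minimal sketch of the one-sample t statistic.
# `one_sample_t` is an illustrative name, not a base R function.
one_sample_t <- function(x, mu0) {
  n     <- length(x)
  x_bar <- mean(x)                        # sample mean
  s2    <- var(x)                         # sample variance (divisor n - 1)
  t_obs <- (x_bar - mu0) / sqrt(s2 / n)   # test statistic
  list(statistic = t_obs, df = n - 1, mean = x_bar, var = s2)
}

# Made-up data, purely for illustration
x   <- c(9.8, 10.4, 10.1, 9.6, 10.9, 10.3)
res <- one_sample_t(x, mu0 = 10)
res$statistic   # observed test statistic
res$df          # degrees of freedom, n - 1
```

The returned degrees of freedom are what the third step needs when the statistic is compared to the $t$-distribution.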
How, and more importantly why, should we compare the test statistic to the $t_{n-1}$-distribution? First it is important to realise that $t$ is a realisation of an estimator:
$$T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}.$$
Had we observed a different sample of size $n$, we would also have obtained a different value for the test statistic $t$. Therefore the estimator $T$ must have a sampling distribution. We can describe this sampling distribution under the assumption that the null hypothesis is true: in that case $T \sim t_{n-1}$. By comparing the observed test statistic to this null sampling distribution, we can then see how likely it is that the sample was drawn from a population with mean given by the null value $\mu_0$. In particular:
$H_0$ is rejected if the observed test statistic is unusually high (or low) when compared to this sampling distribution.
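The idea of a null sampling distribution can be checked by simulation. The sketch below repeatedly draws Normal samples for which the null hypothesis is true and computes the test statistic each time. The values of `mu0`, `sigma`, `n` and `n_sim` are arbitrary choices made for illustration.

```r
# Simulation sketch: sampling distribution of T when H0 is true.
# mu0, sigma, n and n_sim are illustrative choices, not values from the notes.
set.seed(1)
mu0   <- 5
sigma <- 2
n     <- 10
n_sim <- 20000

t_sim <- replicate(n_sim, {
  x <- rnorm(n, mean = mu0, sd = sigma)   # a sample drawn under H0
  (mean(x) - mu0) / (sd(x) / sqrt(n))     # test statistic for this sample
})

# Empirical quantiles should be close to those of the t-distribution
# with n - 1 degrees of freedom
probs <- c(0.05, 0.5, 0.95)
rbind(simulated = quantile(t_sim, probs),
      theory    = qt(probs, df = n - 1))
```

An observed statistic far out in either tail of this simulated distribution would be unusual if the null hypothesis were true.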
To compare the test statistic to the sampling distribution, we can either
Find a critical region;
Calculate a $p$-value;
Use a confidence interval.
Regardless of the method, we first decide on the significance level $\alpha$ of the test.
The significance level is the probability that the null hypothesis is rejected when it is in fact true. It should be taken to be small; usually we test at either the 5% or the 1% level.
To find a critical region:
If $H_1: \mu > \mu_0$, find the $1 - \alpha$ quantile of the sampling distribution and reject $H_0$ if the test statistic is greater than this quantile.
If $H_1: \mu < \mu_0$, find the $\alpha$ quantile of the sampling distribution and reject $H_0$ if the test statistic is less than this quantile. Due to the symmetry of the $t$-distribution, this is the same as finding the $1 - \alpha$ quantile and then rejecting $H_0$ if the test statistic is less than the negative of this quantile.
If $H_1: \mu \neq \mu_0$, find the $\alpha/2$ and $1 - \alpha/2$ quantiles of the sampling distribution and reject $H_0$ if the test statistic is either greater than the higher quantile, or less than the lower quantile.
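The three cases above can be collected into a small R helper. This is a sketch under stated assumptions: `reject_by_critical_region` is a hypothetical function name, the labels `"greater"`, `"less"` and `"two.sided"` are borrowed from the `alternative` argument of R's `t.test`, and the test statistic and sample size used at the end are made-up values.

```r
# Sketch: decision by critical region for the three alternatives.
# `reject_by_critical_region` is a hypothetical helper name.
reject_by_critical_region <- function(t_obs, n, alpha = 0.05,
                                      alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  df <- n - 1
  switch(alternative,
    greater   = t_obs > qt(1 - alpha, df),
    less      = t_obs < qt(alpha, df),
    two.sided = t_obs < qt(alpha / 2, df) || t_obs > qt(1 - alpha / 2, df)
  )
}

# Illustrative values: t = 2.0 from a sample of size 20
reject_by_critical_region(2.0, n = 20, alternative = "greater")
reject_by_critical_region(2.0, n = 20, alternative = "two.sided")
```

The same observed statistic can lead to different decisions under one-tailed and two-tailed alternatives, because the two-tailed test splits $\alpha$ between both tails.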
We illustrate the method in the following example, which uses the sea ice extent data seen in Chapter 2.
Use the following sample of annual minimum sea ice extents to decide whether or not the average minimum sea ice extent in the Arctic is less than 6.5 million km$^2$: 4.55, 5.05, 6.48, 5.62, 6.89, 7.52, 6.40, 6.16, 5.32, 6.61.
Calculate the sample mean $\bar{x} = 6.06$ and the sample variance $s^2 = 0.830$.
Calculate the test statistic
$$t = \frac{6.06 - 6.5}{\sqrt{0.830 / 10}} = -1.53.$$
Find the critical value, that is the 95% quantile of the $t$-distribution with 9 degrees of freedom. From tables, this is 1.833. Alternatively, in R, `qt(0.95, df = 9)` returns the same value to three decimal places.
Since $t = -1.53 > -1.833$ (see Figure 3.1), we do not reject $H_0$ and conclude that there is insufficient evidence, at the 5% level, that the mean sea ice extent is less than 6.5 million km$^2$.
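The calculations in this example can be reproduced in R. The sketch below uses only the data listed in the example; the variable names are arbitrary. It works with the lower 5% quantile directly, which by symmetry is the negative of the 95% quantile used above.

```r
# Sea ice extents (million km^2) from the example
ice <- c(4.55, 5.05, 6.48, 5.62, 6.89, 7.52, 6.40, 6.16, 5.32, 6.61)
mu0 <- 6.5
n   <- length(ice)

x_bar <- mean(ice)                       # 6.06
s2    <- var(ice)                        # approximately 0.830
t_obs <- (x_bar - mu0) / sqrt(s2 / n)    # approximately -1.53

crit   <- qt(0.05, df = n - 1)           # lower 5% quantile, approximately -1.833
reject <- t_obs < crit                   # FALSE: t_obs is not in the critical region
reject
```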
There are two other ways to test the significance of a test statistic. The first of these involves the calculation of a $p$-value.
Assuming that the null hypothesis is true, the $p$-value is the probability, under repeated sampling, of obtaining a test statistic at least as extreme as the one observed.
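To find a $p$-value, let $T \sim t_{n-1}$ and let $t$ be the observed test statistic. Then:
If $H_1: \mu > \mu_0$, find $p = P(T \geq t)$.
If $H_1: \mu < \mu_0$, find $p = P(T \leq t)$.
If $H_1: \mu \neq \mu_0$, find $p = P(|T| \geq |t|) = 2 P(T \geq |t|)$.
Reject $H_0$ if the $p$-value is less than the significance level $\alpha$.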
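These three probabilities can be computed in R with `pt`, the distribution function of the $t$-distribution. The helper below is a sketch: `t_p_value` is a hypothetical name, and the statistic and sample size passed to it are made-up values chosen only to show the calls.

```r
# Sketch: p-value of a one-sample t-test for the three alternatives.
# `t_p_value` is a hypothetical helper name.
t_p_value <- function(t_obs, n, alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  df <- n - 1
  switch(alternative,
    greater   = pt(t_obs, df, lower.tail = FALSE),           # P(T >= t)
    less      = pt(t_obs, df),                               # P(T <= t)
    two.sided = 2 * pt(abs(t_obs), df, lower.tail = FALSE)   # P(|T| >= |t|)
  )
}

# Illustrative values: t = 2.0 from a sample of size 20
p_greater <- t_p_value(2.0, n = 20, alternative = "greater")
p_two     <- t_p_value(2.0, n = 20, alternative = "two.sided")
c(greater = p_greater, two.sided = p_two)
```

For a positive statistic the two-tailed $p$-value is exactly twice the upper-tail $p$-value, which follows from the symmetry of the $t$-distribution.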
A common misconception is that the $p$-value is the probability that the null hypothesis is true! This is not the case.
For the sample of sea ice extent data given in Example 3.2.1, calculate a $p$-value to test whether the mean sea ice extent is less than 6.5 million km$^2$. Test at the 5% level.
The following is a sample of total November rainfalls (mm) for the city of Durham (UK): 85.6, 60.8, 28.3, 45.6, 116.8, 21.1, 18.8, 62.0, 59.5, 63.4, 52.7, 25.0, 12.2, 61.9, 35.8, 71.6.
Test the following hypotheses for the mean November rainfall $\mu$,
$$H_0: \mu = 70 \quad \text{vs.} \quad H_1: \mu \neq 70.$$
Use a 5% significance level.
We assume that our data are an IID sample from a Normal distribution with mean $\mu$ and variance $\sigma^2$ (how can you check the normality assumption?).
Calculate the sample mean $\bar{x} = 51.32$ and sample variance $s^2 = 760.14$.
Calculate the test statistic,
$$t = \frac{51.32 - 70}{\sqrt{760.14 / 16}} = -2.71.$$
To find a critical region, calculate the 97.5% quantile of the $t_{15}$ distribution, for example with `qt(0.975, df = 15)` in R.
This gives 2.13. Since $|t| = 2.71 > 2.13$, we reject $H_0$ and conclude that there is evidence that the mean November rainfall is not equal to 70 mm.
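To obtain a $p$-value, we must calculate $P(|T| \geq 2.71) = 2 P(T \geq 2.71)$ where $T \sim t_{15}$. In R, this is `2 * pt(-2.71, df = 15)`.
This gives $p \approx 0.016 < 0.05$, and so again we would reject $H_0$.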
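The rainfall example can be reproduced step by step in R, and the result checked against the built-in `t.test` function. The sketch below uses only the data given in the example; the variable names are arbitrary.

```r
# November rainfall totals (mm) from the example
rain <- c(85.6, 60.8, 28.3, 45.6, 116.8, 21.1, 18.8, 62.0,
          59.5, 63.4, 52.7, 25.0, 12.2, 61.9, 35.8, 71.6)
mu0 <- 70
n   <- length(rain)

t_obs <- (mean(rain) - mu0) / sqrt(var(rain) / n)   # approximately -2.71
crit  <- qt(0.975, df = n - 1)                      # approximately 2.13
p_val <- 2 * pt(-abs(t_obs), df = n - 1)            # two-tailed p-value

abs(t_obs) > crit   # decision by critical region
p_val < 0.05        # decision by p-value

# The built-in function carries out the same test
fit <- t.test(rain, mu = mu0, alternative = "two.sided")
fit$statistic
fit$p.value
```

Both decision rules agree, as they must: the $p$-value is below $\alpha$ exactly when the test statistic falls in the critical region.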
A special case of the $t$-test is the $z$-test. When the population variance $\sigma^2$ is known, the test statistic used is
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
and the $t_{n-1}$ sampling distribution is replaced by the $N(0, 1)$ distribution.
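A $z$-test can be sketched in R using `pnorm` in place of `pt`. Base R has no built-in one-sample $z$-test, so the function `z_test` below is a hypothetical helper, and the data and the assumed known standard deviation `sigma = 0.5` are made up for illustration.

```r
# Sketch of a two-tailed z-test with known population standard deviation.
# `z_test` is a hypothetical helper; the data and sigma are made up.
z_test <- function(x, mu0, sigma) {
  z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))   # test statistic
  list(statistic = z, p_two_sided = 2 * pnorm(-abs(z)))
}

x     <- c(9.8, 10.4, 10.1, 9.6, 10.9, 10.3)
z_res <- z_test(x, mu0 = 10, sigma = 0.5)
z_res$statistic
z_res$p_two_sided
```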
As the degrees of freedom of the $t$-distribution tend towards infinity, the $t$-distribution gets closer and closer to the $N(0, 1)$ distribution, as shown in Math230. In the context of the $t$-test, since the degrees of freedom are determined by the sample size $n$, as $n$ gets very large the $N(0, 1)$ distribution can be used in place of the $t$-distribution even when the population variance is unknown. In practice we would probably choose to make the Normal approximation to the $t$-distribution if $n$ was greater than 100.
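The convergence can be seen numerically by comparing 97.5% quantiles of the $t$-distribution with the corresponding Normal quantile. The degrees of freedom below are arbitrary illustrative choices.

```r
# Compare 97.5% quantiles of the t-distribution with the N(0, 1) quantile
df_values <- c(5, 15, 30, 100, 1000)   # illustrative degrees of freedom
t_q <- qt(0.975, df = df_values)
z_q <- qnorm(0.975)                    # approximately 1.96

data.frame(df = df_values, t_quantile = t_q, difference = t_q - z_q)
```

The difference shrinks steadily as the degrees of freedom grow, which is why the Normal approximation becomes acceptable for large samples.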