3.2 One sample tests

The following tests allow us to reach a conclusion about whether or not a population mean takes a particular (range of) value(s). For the tests described here, we assume that the data are a sample from a Normal distribution.

In fact, provided that the sample is large enough and the data are continuous with finite variance, this assumption need not hold as the Central Limit Theorem can be used as a justification for a Normal sampling distribution. However, if the data are significantly skewed, then the mean may not be a sensible summary parameter. Can you think why? What might be a better summary of the location of the data?

Suppose that data $x_1,\ldots,x_n$ are assumed to be realisations of IID random variables $X_1,\ldots,X_n$ with Normal$(\mu,\sigma^2)$ distribution. We want to test the null hypothesis

$$H_0:\mu=\mu_0$$

against one of three possible alternatives,

  1. $H_1:\mu>\mu_0$;

  2. $H_1:\mu<\mu_0$;

  3. $H_1:\mu\neq\mu_0$.

The first two of these are one-tailed hypotheses; the third is a two-tailed hypothesis. The following algorithm describes the one-sample t-test which can be used to do exactly this.

  1. Calculate the sample mean $\bar{x}$ and sample variance $s^2$ of the data.

  2. Calculate the test statistic

    $$t=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}.$$

  3. Compare the test statistic $t$ to Student's $t$-distribution with $n-1$ degrees of freedom.
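The first two steps of the algorithm can be sketched in R as follows. This is a minimal illustration rather than part of the original notes: the function name `t_statistic` is hypothetical, and the three-point data vector is a toy input chosen only so that the result is easy to check by hand.

```r
# Hypothetical helper: one-sample t statistic for H0: mu = mu0.
t_statistic <- function(x, mu0) {
  n    <- length(x)
  xbar <- mean(x)   # sample mean
  s    <- sd(x)     # sample standard deviation (divisor n - 1)
  (xbar - mu0) / (s / sqrt(n))
}

# Toy data (not from the notes): mean 2, standard deviation 1, n = 3,
# so the statistic for mu0 = 1 is (2 - 1) / (1 / sqrt(3)) = sqrt(3).
toy <- c(1, 2, 3)
t_statistic(toy, mu0 = 1)
```

R's `sd()` uses the divisor $n-1$, which matches the sample variance $s^2$ used in the test statistic.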

How, and more importantly why, should we compare the test statistic to the $t_{n-1}$-distribution? First, it is important to realise that $t$ is a realisation of an estimator:

$$T=\frac{\bar{X}-\mu_0}{S/\sqrt{n}}.$$

Had we observed a different sample of size $n$, we would also have obtained a different value for the test statistic $t$. Therefore the estimator $T$ must have a sampling distribution. We can describe this sampling distribution under the assumption that the null hypothesis is true. By comparing the observed test statistic to this null sampling distribution, we can then see how plausible it is that the sample was drawn from a population with mean given by the null value $\mu_0$. In particular:

  • $H_0$ is rejected if the observed test statistic $t$ is unusually high (or low) when compared to this sampling distribution.

  • To compare the test statistic $t$ to the sampling distribution, we can either

    (a) find a critical region;

    (b) calculate a p-value;

    (c) use a confidence interval.

Regardless of the method, we first decide on the significance level α of the test.

Definition.

The significance level is the probability that the null hypothesis is rejected when it is in fact true. It should be small; usually we test at either the 5% or the 1% level, that is, $\alpha=0.05$ or $\alpha=0.01$.
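The definition can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original notes: the sample size, seed, number of replications and the choice of a one-tailed test are all arbitrary. It repeatedly draws Normal samples for which the null hypothesis is true and records how often a 5% test rejects.

```r
set.seed(1)       # arbitrary seed, for reproducibility only
n     <- 10       # sample size (illustrative)
mu0   <- 0        # null value; the data are simulated with this mean
alpha <- 0.05     # significance level
n_sim <- 10000    # number of simulated samples

reject <- replicate(n_sim, {
  x     <- rnorm(n, mean = mu0, sd = 1)            # H0 is true here
  t_obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))
  t_obs > qt(1 - alpha, df = n - 1)                # one-tailed test, H1: mu > mu0
})

rejection_rate <- mean(reject)
rejection_rate    # proportion of false rejections; should be close to alpha
```

The proportion of rejections estimates the probability of rejecting a true null hypothesis, which is what the significance level controls.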

To find a critical region:

  • If $H_1:\mu>\mu_0$, find the $100(1-\alpha)\%$ quantile of the sampling distribution and reject $H_0$ if the test statistic is greater than this quantile.

  • If $H_1:\mu<\mu_0$, find the $100\alpha\%$ quantile of the sampling distribution and reject $H_0$ if the test statistic is less than this quantile. Due to the symmetry of the $t$-distribution, this is the same as finding the $100(1-\alpha)\%$ quantile and then rejecting $H_0$ if the test statistic is negative and its modulus lies above that quantile.

  • If $H_1:\mu\neq\mu_0$, find the $100(\alpha/2)\%$ and $100(1-\alpha/2)\%$ quantiles of the sampling distribution and reject $H_0$ if the test statistic is either greater than the higher quantile, or less than the lower quantile.
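The three rules can be collected into one small R function. This is a sketch only: the function name `reject_h0` and its argument names are hypothetical, chosen to mirror the `alternative` argument of R's built-in `t.test`, and `alpha` is a probability (0.05 for the 5% level).

```r
# Hypothetical helper: TRUE if t_obs lies in the critical region.
reject_h0 <- function(t_obs, df, alpha = 0.05,
                      alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    greater   = t_obs > qt(1 - alpha, df),              # upper tail only
    less      = t_obs < qt(alpha, df),                  # lower tail only
    two.sided = t_obs < qt(alpha / 2, df) ||
                t_obs > qt(1 - alpha / 2, df)           # either tail
  )
}

# Illustrative call with 9 degrees of freedom:
reject_h0(2.0, df = 9, alternative = "greater")   # TRUE, since 2.0 > qt(0.95, 9), about 1.833
```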

We illustrate the method in the following example, which uses the sea ice extent data seen in Chapter 2.

Example 3.2.1

Use the following sample of annual minimum sea ice extents to decide whether or not the average minimum sea ice extent in the Arctic is less than 6.5 million km$^2$: 4.55, 5.05, 6.48, 5.62, 6.89, 7.52, 6.40, 6.16, 5.32, 6.61.

  1. Calculate the sample mean $\bar{x}=6.06$ and the sample variance $s^2=0.8296$.

  2. Calculate the test statistic

    $$t=\frac{\bar{x}-6.5}{s/\sqrt{n}}=\frac{6.06-6.5}{\sqrt{0.8296/10}}=-1.53.$$

  3. Find the critical value, that is the 95% quantile of the $t$-distribution with 9 degrees of freedom. From tables, this is 1.833. Alternatively, in R,

    > qt(0.95,9)

  4. Since $|t|=|-1.53|=1.53<1.833$ (see Figure 3.1), the test statistic does not lie in the critical region. We therefore do not reject $H_0$ and conclude that there is insufficient evidence, at the 5% level, that the mean sea ice extent is less than 6.5 million km$^2$.

Fig. 3.1: The density of the $t_9$ sampling distribution for the test statistic in the sea ice example. The critical region for the test at the 5% level is marked in blue using either the true value of the test statistic (left) or the modulus of the test statistic (right). The red lines show the test statistic (left) and the modulus of the test statistic (right).
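The calculation in Example 3.2.1 can be reproduced step by step in R. This is a sketch for checking the arithmetic; the variable names are arbitrary, and the comparison uses the lower 5% quantile directly rather than the modulus.

```r
# Sea ice extents from Example 3.2.1 (million km^2)
ice <- c(4.55, 5.05, 6.48, 5.62, 6.89, 7.52, 6.40, 6.16, 5.32, 6.61)

n     <- length(ice)
xbar  <- mean(ice)                       # sample mean, 6.06
s2    <- var(ice)                        # sample variance, 0.8296
t_obs <- (xbar - 6.5) / sqrt(s2 / n)     # test statistic, about -1.53
crit  <- qt(0.05, df = n - 1)            # lower 5% quantile, about -1.833

t_obs < crit                             # FALSE: t is not in the critical region
```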

There are two other ways to test the significance of a test statistic. The first of these involves the calculation of a p-value.

Definition.

Assuming that the null hypothesis is true, the p-value is the probability, under repeated sampling, of obtaining a test statistic at least as extreme as the one observed.

To find a p-value, where $T\sim t_{n-1}$:

  • If $H_1:\mu>\mu_0$, find $p=\Pr[T>t]$.

  • If $H_1:\mu<\mu_0$, find $p=\Pr[T<t]$.

  • If $H_1:\mu\neq\mu_0$, find $p=2\Pr[T>|t|]$.

Reject $H_0$ if the p-value is less than the significance level $\alpha$.
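These three formulas translate directly into calls to `pt`, the distribution function of the $t$-distribution in R. The wrapper below is a sketch; the name `p_value` and its arguments are hypothetical and not taken from the notes.

```r
# Hypothetical helper: p-value of an observed t statistic with df degrees of freedom.
p_value <- function(t_obs, df,
                    alternative = c("greater", "less", "two.sided")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    greater   = pt(t_obs, df, lower.tail = FALSE),           # Pr[T > t]
    less      = pt(t_obs, df),                               # Pr[T < t]
    two.sided = 2 * pt(abs(t_obs), df, lower.tail = FALSE)   # 2 Pr[T > |t|]
  )
}

# A statistic of exactly zero is the least extreme value possible:
p_value(0, df = 5, alternative = "two.sided")   # 1
p_value(0, df = 5, alternative = "less")        # 0.5
```

Using `lower.tail = FALSE` rather than `1 - pt(...)` gives the same upper-tail probability and is a little more accurate for very small p-values.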

Remark.

A common misconception is that the p-value is the probability that the null hypothesis is true! This is not the case.

Example 3.2.2

For the sample of sea ice extent data given in Example 3.2.1, calculate a p-value to test whether the mean sea ice extent is less than 6.5 million km$^2$. Test at the 5% level.

  1. From Example 3.2.1, the test statistic is $t=-1.53$.

  2. Since $H_1:\mu<6.5$, the p-value is $p=\Pr[T<-1.53]$, under the assumption that $T$ follows a $t_9$-distribution. In R,

    pt(-1.53,9)

    which gives $p=0.0802$.

  3. Since $0.0802>0.05$, we do not reject $H_0$ and conclude that there is insufficient evidence, at the 5% level, that the mean sea ice extent is less than 6.5 million km$^2$.
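As a cross-check, R's built-in `t.test` function carries out the same one-sample test in a single call. The sketch below assumes only the data of Example 3.2.1; because `t.test` works with the unrounded test statistic, its p-value may differ slightly in the later decimal places from the value obtained using $t=-1.53$.

```r
# Sea ice extents from Example 3.2.1 (million km^2)
ice <- c(4.55, 5.05, 6.48, 5.62, 6.89, 7.52, 6.40, 6.16, 5.32, 6.61)

# One-sample t-test of H0: mu = 6.5 against H1: mu < 6.5
res <- t.test(ice, mu = 6.5, alternative = "less")

res$statistic   # test statistic, about -1.53
res$parameter   # degrees of freedom, 9
res$p.value     # about 0.08, so H0 is not rejected at the 5% level
```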

Example 3.2.3

The following is a sample of total November rainfalls (mm) for the city of Durham (UK): 85.6, 60.8, 28.3, 45.6, 116.8, 21.1, 18.8, 62.0, 59.5, 63.4, 52.7, 25.0, 12.2, 61.9, 35.8, 71.6.

Test the following hypotheses for the mean November rainfall,

$$H_0:\mu=70$$

vs.

$$H_1:\mu\neq 70.$$

Use a 5% significance level.

We assume that our data $x_1,\ldots,x_{16}$ are an IID sample from a Normal distribution with mean $\mu$ and variance $\sigma^2$ (how can you check the normality assumption?).

  1. Calculate the sample mean $\bar{x}=51.3$ and sample variance $s^2=760$.

  2. Calculate the test statistic,

    $$t=\frac{\bar{x}-\mu_0}{s/\sqrt{n}}=\frac{51.3-70}{\sqrt{760/16}}=-2.71.$$

  3. To find a critical region, calculate the 97.5% quantile of the $t_{15}$ distribution,

    > qt(0.975,df=15)

    This gives 2.13. Since $|-2.71|=2.71>2.13$, we reject $H_0$ and conclude that there is evidence that the mean November rainfall is not equal to 70mm.

  4. To obtain a p-value, we must calculate $2\Pr[T>|-2.71|]$ where $T\sim t_{15}$. In R,

    > 2*(1-pt(2.71,df=15))

    This gives $0.0161<0.05$, and so again we would reject $H_0$.
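The whole of Example 3.2.3 can also be checked with R's built-in `t.test`, which is two-sided by default. The sketch below additionally prints the 95% confidence interval that `t.test` reports. Reading the test off the interval is the third approach listed earlier and is not worked through in this example, so treat that last line as an aside.

```r
# November rainfall totals for Durham (mm), from Example 3.2.3
rain <- c(85.6, 60.8, 28.3, 45.6, 116.8, 21.1, 18.8, 62.0,
          59.5, 63.4, 52.7, 25.0, 12.2, 61.9, 35.8, 71.6)

res <- t.test(rain, mu = 70)   # H0: mu = 70 against H1: mu != 70

res$statistic   # test statistic, about -2.71
res$p.value     # below 0.05, so H0 is rejected at the 5% level
res$conf.int    # 95% confidence interval for mu; it does not contain 70
```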

Remark.

A special case of the t-test is the z-test. When the population variance $\sigma^2$ is known, the test statistic used is

$$z=\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$$

and the $t_{n-1}$ sampling distribution is replaced by the Normal$(0,1)$ distribution.
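Base R has no dedicated z-test function, but the calculation is short. The following is a minimal sketch, assuming the population standard deviation `sigma` is supplied by the user; the function name `z_test` and the toy input are hypothetical and not from the notes.

```r
# Hypothetical helper: z statistic and two-sided p-value when sigma is known.
z_test <- function(x, mu0, sigma) {
  z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
  list(z = z,
       p_two_sided = 2 * pnorm(abs(z), lower.tail = FALSE))  # Normal(0,1) tails
}

# Toy input (not from the notes): mean 2, n = 3, known sigma = 1,
# so z = (2 - 1) / (1 / sqrt(3)) = sqrt(3).
z_test(c(1, 2, 3), mu0 = 1, sigma = 1)
```

The only changes from the t-test are that the known $\sigma$ replaces $s$ and `pnorm` replaces `pt`.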

Remark.

As the degrees of freedom of the $t$-distribution tend towards infinity, the $t$-distribution gets closer and closer to the Normal$(0,1)$ distribution, as shown in Math230. In the context of the t-test, since the degrees of freedom are determined by the sample size $n$, as $n$ gets very large the Normal$(0,1)$ distribution can be used in place of the $t$-distribution even when the population variance is unknown. In practice we would probably choose to approximate the $t$-distribution by the Normal distribution if $n$ were greater than 100.
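The convergence can be seen numerically by comparing the 97.5% quantile of the $t$-distribution with that of the Normal$(0,1)$ distribution. The degrees of freedom below are arbitrary illustrative choices.

```r
df      <- c(5, 15, 30, 100, 1000)   # illustrative degrees of freedom
t_quant <- qt(0.975, df)             # 97.5% quantiles of the t-distribution
z_quant <- qnorm(0.975)              # 97.5% quantile of Normal(0,1), about 1.96

# The t quantiles decrease towards the Normal quantile as df grows
data.frame(df = df,
           t_quantile = t_quant,
           difference = t_quant - z_quant)
```

The difference column shrinks as the degrees of freedom increase, which is why the critical values from the two distributions become interchangeable for large $n$.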