3.3.2 Paired data

Suppose that $x_{1},\ldots,x_{n}$ are realisations of IID random variables $X_{1},\ldots,X_{n}$ with a $\operatorname{Normal}(\mu_{X},\sigma_{X}^{2})$ distribution and $y_{1},\ldots,y_{n}$ are realisations of IID random variables $Y_{1},\ldots,Y_{n}$ with a $\operatorname{Normal}(\mu_{Y},\sigma_{Y}^{2})$ distribution. Assume that the samples are paired and that the population variances are unknown.

Remark.

By assuming paired samples, the random variables within a pair, $X_{i}$ and $Y_{i}$, are no longer independent. To make things easier, we will therefore assume the same covariance for all pairs, i.e. $\operatorname{Cov}(X_{i},Y_{i})=\sigma_{XY}$ for $i=1,\ldots,n$.

In the case of paired data, to test the null hypothesis

$\displaystyle H_{0}:\mu_{X}=\mu_{Y}$

against any one of the alternatives $H_{1}:\mu_{X}\neq\mu_{Y}$, $H_{1}:\mu_{X}<\mu_{Y}$ or $H_{1}:\mu_{X}>\mu_{Y}$, we consider the differences $D_{i}=X_{i}-Y_{i}$ for $i=1,\ldots,n$. The above hypothesis can then be rewritten in terms of the population mean $\mu_{D}$ of the differences,

$\displaystyle H_{0}:\mu_{D}=0$

which is tested against $H_{1}:\mu_{D}\neq 0$, $H_{1}:\mu_{D}<0$ or $H_{1}:\mu_{D}>0$ respectively.
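To see why the two formulations agree, the mean and variance of a single difference can be worked out directly. The short derivation below uses only linearity of expectation and the common-covariance assumption from the remark above. The symbol $\sigma_{D}^{2}$ for the variance of the differences is introduced here for convenience, and normality of the differences additionally assumes that each pair $(X_{i},Y_{i})$ is jointly normal.

```latex
\begin{align*}
D_{i} &= X_{i}-Y_{i},\\
\mu_{D} &= \operatorname{E}[D_{i}] = \mu_{X}-\mu_{Y},\\
\sigma_{D}^{2} &= \operatorname{Var}(D_{i}) = \sigma_{X}^{2}+\sigma_{Y}^{2}-2\sigma_{XY}.
\end{align*}
```

So $\mu_{X}=\mu_{Y}$ holds exactly when $\mu_{D}=0$, and the unknown quantities $\sigma_{X}^{2}$, $\sigma_{Y}^{2}$ and $\sigma_{XY}$ enter only through $\sigma_{D}^{2}$, which is estimated from the observed differences.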

The algorithm for this test is as follows:

1. Calculate the differences $d_{i}=x_{i}-y_{i}$ for $i=1,\ldots,n$.

2. Calculate the sample mean $\bar{d}$ and sample variance $s^{2}_{d}$ of the differences $d_{1},\ldots,d_{n}$.

3. Calculate the test statistic,

$\displaystyle t=\frac{\bar{d}}{s_{d}/\sqrt{n}}.$

4. Compare this to the $t_{n-1}$-distribution, either by finding a $p$-value, or by using an appropriate critical value.
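The four steps above can be sketched in R as follows. This is a minimal sketch rather than code from the notes: the function name `paired_t_stat` and its arguments are hypothetical, and it assumes two numeric vectors of equal length with no missing values. Only base R functions (`mean`, `sd`, `pt`) are used.

```r
# Hypothetical helper (not from the notes): paired t-test computed by hand.
# x, y: numeric vectors of paired observations, same length, no NAs.
# alternative: "two.sided", "less" (mu_D < 0) or "greater" (mu_D > 0).
paired_t_stat <- function(x, y, alternative = "two.sided") {
  stopifnot(length(x) == length(y), length(x) >= 2)
  d <- x - y                          # step 1: differences
  n <- length(d)
  d_bar <- mean(d)                    # step 2: sample mean ...
  s_d <- sd(d)                        # ... and sample sd (divisor n - 1)
  t_stat <- d_bar / (s_d / sqrt(n))   # step 3: test statistic
  # step 4: p-value from the t distribution with n - 1 degrees of freedom
  p_value <- switch(alternative,
    two.sided = 2 * pt(-abs(t_stat), df = n - 1),
    less      = pt(t_stat, df = n - 1),
    greater   = pt(t_stat, df = n - 1, lower.tail = FALSE),
    stop("unknown alternative")
  )
  list(statistic = t_stat, df = n - 1, p.value = p_value)
}
```

The result should agree with the built-in `t.test(x, y, paired = TRUE, alternative = ...)`, which is the more usual choice in practice; writing the steps out by hand simply makes the correspondence with the algorithm explicit.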

Example 3.3.2 Drug treatment comparison

This example comes from Chapter 6 of Diggle and Chetwynd (2011). They report results from a clinical trial to compare two anti-congestion drugs used to relieve short-term symptoms of asthma, in which each individual tests both drugs.

In the asthma drug trial, the maximum rate at which the individual was able to exhale, denoted by PEF (Peak Expiratory Flow), was measured a fixed time after administration of each treatment. The following table shows the results for all 13 participants:

Individual ( $i$ )	1	2	3	4	5	6	7	8	9	10	11	12	13
Drug F ( $x_{i}$ )	310	310	370	410	250	380	330	385	400	410	320	340	220
Drug S ( $y_{i}$ )	270	260	300	390	210	350	365	370	310	380	290	260	90
Differences ( $d_{i}=x_{i}-y_{i}$ )	40	50	70	20	40	30	-35	15	90	30	30	80	130

For the two drugs F and S, test the following hypotheses:

$\displaystyle H_{0}:\mu_{F}=\mu_{S}$

vs.

$\displaystyle H_{1}:\mu_{F}\neq\mu_{S}.$

1. This is equivalent to testing

$\displaystyle H_{0}:\mu_{D}=0$

vs.

$\displaystyle H_{1}:\mu_{D}\neq 0,$

where $D_{i}=X_{i}-Y_{i}$.

2. Calculate the differences by subtracting PEF for drug S from PEF for drug F; these are given in the above table.

3. Calculate the sample mean and variance of the differences:
   1. $\bar{d}=\frac{40+50+\ldots+130}{13}=45.38$,
   2. $s^{2}_{d}=\frac{1}{12}\sum_{i=1}^{13}(d_{i}-45.38)^{2}=1647.8$.

4. Find the test statistic

$\displaystyle t=\frac{\bar{d}}{s_{d}/\sqrt{13}}=\frac{45.38}{40.59/3.61}=4.03.$

5. The $p$-value for this, since $t>0$ and the alternative hypothesis is two-tailed, is

> 2*(1-pt(4.03,df=12))

[1] 0.001669197

that is $p=1.67\times 10^{-3}<0.05$ , so we would reject $H_{0}$ at the 5% level and conclude that there is evidence of a difference in the mean PEF between the two drugs.
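As a cross-check, the whole calculation can be reproduced from the table. The sketch below assumes only base R; the vector names `drug_f` and `drug_s` are illustrative, and the values are copied from the table above. It also shows the critical-value route as an alternative to the $p$-value.

```r
# PEF measurements copied from the table (vector names are illustrative)
drug_f <- c(310, 310, 370, 410, 250, 380, 330, 385, 400, 410, 320, 340, 220)
drug_s <- c(270, 260, 300, 390, 210, 350, 365, 370, 310, 380, 290, 260,  90)

d <- drug_f - drug_s                 # differences d_i = x_i - y_i
n <- length(d)                       # 13 participants

d_bar  <- mean(d)                    # sample mean of the differences
s2_d   <- var(d)                     # sample variance (divisor n - 1)
t_stat <- d_bar / sqrt(s2_d / n)     # test statistic

# Two-sided p-value from the t distribution with n - 1 = 12 degrees of freedom
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)

# Critical-value route: reject H0 at the 5% level if |t| exceeds this quantile
t_crit <- qt(0.975, df = n - 1)

print(round(c(d_bar = d_bar, s2_d = s2_d, t = t_stat), 2))
print(abs(t_stat) > t_crit)

# The built-in paired test should give the same statistic and p-value
print(t.test(drug_f, drug_s, paired = TRUE))
```

Because `t_stat` is not rounded to 4.03 before calling `pt`, the $p$-value obtained this way differs slightly in later digits from the one quoted above; the conclusion is unchanged.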

Remark.

Note that we should always take care to remove any other factors which could affect the hypothesis test. In this experiment, administration of the second drug was separated from administration of the first by a time interval long enough for any effects of the first drug to have worn off, and the order in which the two treatments were given was randomised to remove any potential effect of ordering.