10.1 The F test


10.1.1 Where does the F-test come from?

From Section 7.2.1, the residual sum of squares divided by its degrees of freedom is an unbiased estimator of the residual variance,

\[
\mathbb{E}\left[\frac{SS}{n-p}\right]=\sigma^2.
\]

Equivalently,

\[
\mathbb{E}[SS]=(n-p)\sigma^2.
\]
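This identity is easy to check by simulation. Below is a minimal Python sketch; the sample size, design matrix, coefficients and noise level are illustrative assumptions, not values from the course.

    import numpy as np

    # Illustrative setup: n observations, p regression parameters,
    # true residual standard deviation sigma.
    rng = np.random.default_rng(1)
    n, p, sigma = 50, 3, 2.0
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    beta = np.array([1.0, 0.5, -0.3])

    ss_values = []
    for _ in range(5000):
        y = X @ beta + rng.normal(scale=sigma, size=n)
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta_hat
        ss_values.append(resid @ resid)  # residual sum of squares

    # The average SS should be close to (n - p) * sigma^2 = 188.
    print(np.mean(ss_values), (n - p) * sigma**2)

The average of the simulated sums of squares settles near $(n-p)\sigma^2$, as the identity predicts.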

So if model 1 (with $p_1$ parameters) and model 2 (with $p_2 > p_1$ parameters) both fit the data, then both of their normalised sums of squares are unbiased estimates of $\sigma^2$, and the expected difference in their sums of squares is

\[
\mathbb{E}[SS_1-SS_2]=\mathbb{E}[SS_1]-\mathbb{E}[SS_2]=(n-p_1)\sigma^2-(n-p_2)\sigma^2=(p_2-p_1)\sigma^2,
\]

so $(SS_1-SS_2)/(p_2-p_1)$ is also an unbiased estimator of the residual variance $\sigma^2$.
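This can also be verified by simulation. A brief sketch, assuming an illustrative pair of nested models in which model 1 (intercept only) is true, so that both models fit:

    import numpy as np

    rng = np.random.default_rng(2)
    n, p1, p2, sigma = 40, 1, 2, 2.0
    x = rng.uniform(0, 10, size=n)
    X1 = np.ones((n, 1))                   # model 1: intercept only
    X2 = np.column_stack([np.ones(n), x])  # model 2: adds a slope (truly zero)

    def rss(X, y):
        # Residual sum of squares from a least-squares fit.
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta_hat
        return r @ r

    estimates = []
    for _ in range(5000):
        y = 1.0 + rng.normal(scale=sigma, size=n)
        estimates.append((rss(X1, y) - rss(X2, y)) / (p2 - p1))

    # The mean should be close to sigma^2 = 4 when both models fit.
    print(np.mean(estimates))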

But if model 1 is not a sufficiently good model for the data then

\[
\mathbb{E}\left[\frac{SS_1-SS_2}{p_2-p_1}\right]>\sigma^2,
\]

since $\mathbb{E}[SS_1]$ will exceed $(n-p_1)\sigma^2$: model 1 does not account for enough of the variability in the response, and the unexplained variability inflates $SS_1$.

It follows that the F-statistic

\[
F=\frac{(SS_1-SS_2)/(p_2-p_1)}{SS_2/(n-p_2)}
\]

is simply the ratio of two estimates of $\sigma^2$. If model 1 is a sufficient fit, this ratio will be close to 1; otherwise it will be greater than 1.
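To make the statistic concrete, here is a minimal Python sketch that fits two nested models by least squares and evaluates $F$; the data-generating model (a genuine slope, so model 1 should not be sufficient) is an illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 40
    x = rng.uniform(0, 10, size=n)
    # Illustrative truth: y really depends on x, so the intercept-only
    # model (model 1) misses real structure.
    y = 1.0 + 0.8 * x + rng.normal(scale=2.0, size=n)

    def rss(X, y):
        # Residual sum of squares from a least-squares fit.
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta_hat
        return r @ r

    X1 = np.ones((n, 1))                   # model 1: intercept only, p1 = 1
    X2 = np.column_stack([np.ones(n), x])  # model 2: intercept + slope, p2 = 2
    p1, p2 = 1, 2

    F = ((rss(X1, y) - rss(X2, y)) / (p2 - p1)) / (rss(X2, y) / (n - p2))
    print(F)  # far above 1 here, since model 1 omits the slope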

To see how far the F-ratio must be from 1 for the difference to be unlikely to have occurred by chance, we need its sampling distribution. It turns out that, when model 1 is an adequate fit, the appropriate distribution is the $F_{p_2-p_1,\,n-p_2}$ distribution. The proof of this is too long to cover here.
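The claimed sampling distribution can at least be checked empirically. A sketch under the same illustrative intercept-versus-slope setup as above, but with the slope set to zero so that model 1 holds:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    n, p1, p2 = 40, 1, 2
    x = rng.uniform(0, 10, size=n)
    X1 = np.ones((n, 1))
    X2 = np.column_stack([np.ones(n), x])

    def rss(X, y):
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta_hat
        return r @ r

    # Simulate the F-statistic repeatedly when model 1 is true.
    f_stats = []
    for _ in range(2000):
        y = 1.0 + rng.normal(scale=2.0, size=n)
        ss1, ss2 = rss(X1, y), rss(X2, y)
        f_stats.append(((ss1 - ss2) / (p2 - p1)) / (ss2 / (n - p2)))

    # The empirical distribution should match F(p2 - p1, n - p2);
    # the Kolmogorov-Smirnov statistic should be small.
    print(stats.kstest(f_stats, stats.f(p2 - p1, n - p2).cdf))

    # In practice the p-value for an observed statistic f_obs is the
    # upper-tail probability stats.f.sf(f_obs, p2 - p1, n - p2).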