
10.1 The F test

The F-test provides a formal statistical procedure for choosing between two nested models. It is based on a comparison of the residual sums of squares of the two models.

Suppose that model 1 has \(p_1\) explanatory variables, model 2 has \(p_2 > p_1\) explanatory variables, and model 1 is nested in model 2. Let model 1 have design matrix \(X\) and parameters \(\beta\), and let model 2 have design matrix \(A\) and parameters \(\gamma\).

First we show formally that adding explanatory variables can never worsen the model fit, in the sense that the residual sum of squares for the fitted model cannot increase.

Let \(SS_1 = (y - X\hat{\beta})^T(y - X\hat{\beta})\) and \(SS_2 = (y - A\hat{\gamma})^T(y - A\hat{\gamma})\) be the residual sums of squares for models 1 and 2 respectively. Then

\[SS_2 \le SS_1.\]

Why does this last inequality hold?

Because of the nesting, we can always find a value \(\tilde{\gamma}\) such that

\[X\hat{\beta} = A\tilde{\gamma}.\]

Recalling the definition of the sum of squares,

\[SS_2 = (y - A\hat{\gamma})^T(y - A\hat{\gamma}) \le (y - A\tilde{\gamma})^T(y - A\tilde{\gamma}),\]

since \(\hat{\gamma}\) is the least squares estimate for model 2. So by definition of \(\tilde{\gamma}\),

\[SS_2 \le (y - A\tilde{\gamma})^T(y - A\tilde{\gamma}) = (y - X\hat{\beta})^T(y - X\hat{\beta}) = SS_1.\]
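This inequality can also be checked numerically. The sketch below uses simulated data (all variable names are illustrative): a model with one covariate is nested in a model with two, and its residual sum of squares can never be smaller.

```r
# Check numerically that adding a covariate cannot increase the residual SS:
# the model y ~ x1 is nested in y ~ x1 + x2
set.seed(1)
x1 <- rnorm(30); x2 <- rnorm(30)
y  <- 1 + 2 * x1 + rnorm(30)

m1 <- lm(y ~ x1)        # smaller (nested) model
m2 <- lm(y ~ x1 + x2)   # larger model
sum(m1$residuals^2) >= sum(m2$residuals^2)   # TRUE, as the argument shows
```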

To carry out the F-test we must decide whether the difference between SS1 and SS2 is sufficiently large to merit the inclusion of the additional explanatory variables in model 2.

Consider the following hypothesis test:

\(H_0\): Model 1 is the best fit

vs.

\(H_1\): Model 2 is the best fit.
Remark.

We do not say that ‘Model 1 is the true model’ or ‘Model 2 is the true model’. All models, be they probabilistic or deterministic, are a simplification of real life. No model can exactly describe a real-life process. But some models can describe the truth ‘better’ than others. George Box (1919-2013), British statistician: ‘essentially, all models are wrong, but some are useful’.

To test H0 against H1, first calculate the test statistic

\[F = \frac{(SS_1 - SS_2)/(p_2 - p_1)}{SS_2/(n - p_2)}. \qquad (10.1)\]

Now compare the test statistic to the \(F_{p_2 - p_1,\, n - p_2}\) distribution, and reject \(H_0\) if the test statistic exceeds the critical value (equivalently, if the p-value is too small).

The critical value from the \(F_{p_2 - p_1,\, n - p_2}\) distribution can either be evaluated in R or obtained from statistical tables.
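In R the critical value and p-value come from `qf()` and `pf()`. The 5% level and the degrees of freedom \((1, 55)\) below are illustrative choices, not fixed by the text.

```r
# 5% critical value of the F distribution with (p2 - p1, n - p2) = (1, 55)
# degrees of freedom (level and degrees of freedom chosen for illustration)
qf(0.95, df1 = 1, df2 = 55)

# p-value for a hypothetical observed test statistic F = 2.67
1 - pf(2.67, df1 = 1, df2 = 55)
```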

Example 10.1.1 (Brain weights, continued).

We proposed three models for log(brain weight) with the following explanatory variables:

  1. L1: log(body weight)

  2. L2: hours sleep per day

  3. L3: log(body weight) + hours sleep per day

Which of these models can we use the F-test to decide between?

The F-test does not allow us to choose between models L1 and L2, since these are not nested. However, it does give us a way to choose between either the pair L1 and L3, or the pair L2 and L3.
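For reference, the three fits might be produced as below. The data frame `mammals` and its column names are assumptions, filled with toy data here so that the sketch runs stand-alone; the numbers will not match the course data.

```r
# Sketch of the three model fits; `mammals` and its columns are assumptions
set.seed(10)
mammals <- data.frame(logbody = rnorm(58), sleep = runif(58, 2, 20))
mammals$logbrain <- 2.15 + 0.76 * mammals$logbody + rnorm(58, sd = 0.7)

L1 <- lm(logbrain ~ logbody,         data = mammals)
L2 <- lm(logbrain ~ sleep,           data = mammals)
L3 <- lm(logbrain ~ logbody + sleep, data = mammals)
coef(summary(L1))   # estimates, standard errors, t values and p-values
```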

To choose between L1 and L2, we instead take a more ad hoc approach, looking to see which of the explanatory variables is ‘more significant’ than the other when we test

\(H_0: \beta_2 = 0\)

vs.

\(H_1: \beta_2 \neq 0.\)

Using summary(L1) and summary(L2), we see that the p-value for \(\beta_2\) in L1 is <2e-16, while for \(\beta_2\) in L2 it is 4.30e-06. As we saw earlier, both of these indicate highly significant relationships between the response and the explanatory variable in question.

Which of the single covariate models is preferable?

Since the p-value for log(body weight) in model L1 is lower, our preferred single covariate model is L1.

We can now use the F-test to choose between our preferred single covariate model L1 and the two covariate model L3:

\(H_0\): L1 is the best fit

vs.

\(H_1\): L3 is the best fit.

We first find the sums of squares for both models. For L1, using the definition of the residual sum of squares,

\[SS(L1) = \sum_{i=1}^{58} \hat{\epsilon}_i^2 = \sum_{i=1}^{58} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_{i,1})^2 = \sum_{i=1}^{58} (y_i - 2.15 - 0.759\, x_{i,1})^2.\]

To calculate this in R,

> sum(L1$residuals^2)
[1] 28.00023

So SS(L1)=28.0.

For L3,

\[SS(L3) = \sum_{i=1}^{58} \hat{\epsilon}_i^2 = \sum_{i=1}^{58} (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_{i,1} - \hat{\beta}_3 x_{i,2})^2 = \sum_{i=1}^{58} (y_i - 2.60 - 0.728\, x_{i,1} + 0.0386\, x_{i,2})^2.\]

To calculate this in R,

> sum(L3$residuals^2)
[1] 26.70658

So SS(L3)=26.7.
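As an aside, `deviance()` applied to an lm object returns the same residual sum of squares, so it could replace the `sum(...^2)` computations above. A quick check on R's built-in `cars` data:

```r
# deviance() of an lm fit equals the residual sum of squares
fit <- lm(dist ~ speed, data = cars)   # built-in example data set
all.equal(deviance(fit), sum(fit$residuals^2))   # TRUE
```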

Next, we find the degrees of freedom for the two models. Since \(n = 58\):

  • L1 has \(p_1 = 2\) regression coefficients, so the degrees of freedom are \(n - p_1 = 58 - 2 = 56\).

  • L3 has \(p_2 = 3\) regression coefficients, so the degrees of freedom are \(n - p_2 = 58 - 3 = 55\).

Finally we calculate the F-statistic given in equation (10.1),

\[F = \frac{[SS(L1) - SS(L3)]/(p_2 - p_1)}{SS(L3)/(n - p_2)} = \frac{(28.000 - 26.707)/(3 - 2)}{26.707/(58 - 3)} = \frac{1.294}{0.486} = 2.66.\]

The test statistic \(F = 2.66\) is then compared to the F distribution with \((p_2 - p_1,\, n - p_2) = (1, 55)\) degrees of freedom. From tables, the critical value is just above 4.00; from R it is 4.02.

What can we conclude from this?

Since \(2.66 < 4.02\), we conclude that there is insufficient evidence to reject \(H_0\). There is no evidence to support choosing the more complicated model, and so the best fitting model is L1.
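The whole calculation can be assembled in R from the residual sums of squares found earlier:

```r
# F test of L1 against L3, using the residual sums of squares found earlier
SS1 <- 28.00023; SS3 <- 26.70658
n <- 58; p1 <- 2; p2 <- 3

F_stat <- ((SS1 - SS3) / (p2 - p1)) / (SS3 / (n - p2))
crit   <- qf(0.95, df1 = p2 - p1, df2 = n - p2)   # 5% critical value
F_stat
crit
F_stat > crit   # FALSE: do not reject H0
```

With the fitted model objects available, `anova(L1, L3)` reports the same residual sums of squares, F statistic and p-value in a single table.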

Remark.

We should not be too surprised by this result, since we have already seen that the coefficient for total sleep time is not significantly different from zero in model L3. Once we have accounted for body weight, there is no extra information in total sleep time to explain any remaining variability in brain weights.