
10.2 Link to one-way ANOVA

Recall from Chapter 5 that a one-way ANOVA is a method for comparing the means of three or more groups; it is an extension of the unpaired $t$-test.

It turns out that the one-way ANOVA is a special case of a simple linear model, in which the explanatory variable is a factor with three or more levels, where each level represents membership of one of the groups.

Suppose that the factor has $m$ levels. Then the linear model for a one-way ANOVA can be written as

\displaystyle\mathbb{E}[Y_{i}]=\beta_{1}x_{i,1}+\beta_{2}x_{i,2}+\ldots+\beta_{m}x_{i,m}

where $x_{i,j}$ is the indicator variable for the $j$ -th level of the factor.
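As a minimal sketch of this coding, the following builds the indicator variables $x_{i,j}$ from a list of group labels and evaluates $\mathbb{E}[Y_{i}]$ for each individual. The labels, the coefficient values and the helper name `indicator_rows` are all made up for illustration; they do not come from the notes.

```python
# Hypothetical group labels for n = 6 individuals and m = 3 levels.
groups = ["a", "b", "a", "c", "b", "c"]
levels = sorted(set(groups))  # fixed ordering of the m levels

def indicator_rows(groups, levels):
    """Return one row (x_{i,1}, ..., x_{i,m}) of 0/1 indicators per individual."""
    return [[1 if g == level else 0 for level in levels] for g in groups]

X = indicator_rows(groups, levels)

# Hypothetical coefficients: beta_j plays the role of the mean of group j.
beta = [10.0, 12.5, 9.0]

# E[Y_i] = sum_j beta_j * x_{i,j}, which picks out the coefficient
# belonging to the group of individual i.
expected = [sum(b * x for b, x in zip(beta, row)) for row in X]
```

Each row of `X` contains exactly one 1, so the linear predictor for individual $i$ reduces to the single coefficient for that individual's group.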

The purpose of an ANOVA is to test whether the mean response varies between different levels of the factor. This is equivalent to testing

\displaystyle H_{0}:\beta_{1}=\beta_{2}=\cdots=\beta_{m}

vs.

\displaystyle H_{1}:\beta_{j}\neq\beta_{k}\text{ for at least one pair }j\neq k.

In turn, this is equivalent to a model selection between

1. $H_{0}$: Model 1, where $\mathbb{E}[Y_{i}]=\beta_{1}$; and

2. $H_{1}$: Model 2, where $\mathbb{E}[Y_{i}]=\beta_{1}x_{i,1}+\beta_{2}x_{i,2}+\ldots+\beta_{m}x_{i,m}$.

Now, model 1 states that all responses share a common population mean, so its design matrix is simply a column of 1's and $\hat{\beta}_{1}=\bar{y}$, the overall sample mean. For model 2, the design matrix has $m$ columns, with

X_{i,j}=\left\{\begin{array}[]{ll}1&\quad\text{if individual }i\text{ is in group }j\\ 0&\quad\text{otherwise}\end{array}\right.

Since each individual belongs to exactly one group, the columns of $X$ are mutually orthogonal. Therefore $X^{\prime}X$ is an $m\times m$ diagonal matrix, the diagonal entries of which are the numbers of individuals in each of the groups,

(X^{\prime}X)_{j,j}=n_{j},

$j=1,\dots,m$ , and $X^{\prime}y$ is a vector of length $m$ , with $j$ -th entry being the sum of all the responses in group $j$ . It follows that

	$\displaystyle\hat{\beta}_{j}$	$\displaystyle=[(X^{\prime}X)^{-1}X^{\prime}y]_{j}$
		$\displaystyle=\frac{1}{n_{j}}\sum_{i=1}^{n}y_{i}I[\text{individual }i\text{ is in group }j]$
		$\displaystyle=\bar{y}_{j}$

i.e. the least squares estimate of the $j$ -th regression coefficient is the observed mean of that group.
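The calculation above can be checked numerically. The sketch below uses a small made-up data set (the values in `groups` and `y` are purely illustrative), forms $X^{\prime}X$ and $X^{\prime}y$ entry by entry, and compares the resulting estimates with the group means computed directly. Because $X^{\prime}X$ is diagonal here, inverting it only requires dividing by each diagonal entry.

```python
# Made-up data: group index (0..m-1) and response for each of n individuals.
groups = [0, 0, 1, 1, 1, 2]
y = [4.0, 6.0, 1.0, 2.0, 3.0, 10.0]
m = 3
n = len(y)

# Design matrix: X[i][j] = 1 if individual i is in group j, else 0.
X = [[1 if g == j else 0 for j in range(m)] for g in groups]

# X'X, computed entry by entry.
XtX = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(m)]
       for j in range(m)]

# X'y: the j-th entry is the sum of the responses in group j.
Xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]

# X'X is diagonal, so (X'X)^{-1} X'y just divides each entry by n_j.
beta_hat = [Xty[j] / XtX[j][j] for j in range(m)]

# Group means computed directly, for comparison.
group_means = [
    sum(yi for yi, g in zip(y, groups) if g == j) / groups.count(j)
    for j in range(m)
]
```

Here `XtX` has the group sizes on its diagonal and zeros elsewhere, and `beta_hat` agrees with `group_means`.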

Calculating the sums of squares for the two models, we have

\displaystyle SS_{1}=\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}

which, in ANOVA terminology, is what we referred to as the ‘total sum of squares’, and

\displaystyle SS_{2}=\sum_{i=1}^{n}(y_{i}-\bar{y}_{1}x_{i,1}-\ldots-\bar{y}_{m}x_{i,m})^{2}

which, in ANOVA terminology, is what we referred to as the ‘within groups sum of squares’.

Consequently, since $SS_{T}=SS_{B}+SS_{W}$, where $SS_{B}$ is the between groups sum of squares, the $F$-ratio for model selection is identical to the test statistic used for the one-way ANOVA:

	$\displaystyle F$	$\displaystyle=\frac{(SS_{1}-SS_{2})/(m-1)}{SS_{2}/(n-m)}$
		$\displaystyle=\frac{(SS_{T}-SS_{W})/(m-1)}{SS_{W}/(n-m)}$
		$\displaystyle=\frac{SS_{B}/(m-1)}{SS_{W}/(n-m)}$
		$\displaystyle=\frac{MS_{B}}{MS_{W}}.$
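To illustrate this equivalence, the sketch below computes the $F$-ratio in two ways on a small made-up data set: once from the residual sums of squares of the two models, and once from the between groups and within groups sums of squares. The data and all variable names are assumptions chosen for illustration only.

```python
# Made-up data: group index (0..m-1) and response for each of n individuals.
groups = [0, 0, 1, 1, 1, 2]
y = [4.0, 6.0, 1.0, 2.0, 3.0, 10.0]
m = 3
n = len(y)

def mean(values):
    return sum(values) / len(values)

y_bar = mean(y)
by_group = [[yi for yi, g in zip(y, groups) if g == j] for j in range(m)]
group_means = [mean(vals) for vals in by_group]

# Model 1 (common mean): residual sum of squares = total sum of squares.
ss_1 = sum((yi - y_bar) ** 2 for yi in y)

# Model 2 (one mean per group): residual sum of squares = within groups SS.
ss_2 = sum((yi - group_means[g]) ** 2 for yi, g in zip(y, groups))

# Between groups sum of squares, computed directly from the group means.
ss_b = sum(len(vals) * (gm - y_bar) ** 2
           for vals, gm in zip(by_group, group_means))

# F-ratio from the model comparison and from the ANOVA mean squares.
f_model = ((ss_1 - ss_2) / (m - 1)) / (ss_2 / (n - m))
f_anova = (ss_b / (m - 1)) / (ss_2 / (n - m))
```

Up to floating-point rounding, `f_model` and `f_anova` coincide, because `ss_1 - ss_2` equals `ss_b`.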