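The linear predictor $\eta$ is a linear combination of the explanatory variables $x_1, \dots, x_p$, defined by $\eta = \beta_1 x_1 + \dots + \beta_p x_p$. A more concise notation for the linear predictor suppresses reference to the coefficients of the combination and writes $\eta \in \operatorname{span}\{x_1, \dots, x_p\}$.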
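The specification $E(Y_i) = \alpha + \beta x_i$ for $i = 1, \dots, n$, or in vector form $E(Y) = \alpha 1 + \beta x$ with $1$ the vector of ones, can be written as $E(Y) \in \operatorname{span}\{1, x\}$.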
Now, it may be that the relationship between the expected response and the explanatory variable is more complicated than this. A reasonable procedure is to see whether enlarging the model to include a quadratic term improves the fit, i.e. $E(Y) \in \operatorname{span}\{1, x, x^2\}$, where $x^2 = (x_1^2, \dots, x_n^2)$.
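The comparison between a straight-line model and its quadratic enlargement can be sketched numerically. The following is a minimal illustration, not a prescribed procedure: the data are simulated, and the helper name `residual_sum_of_squares` is hypothetical. It assumes NumPy is available and uses ordinary least squares to project the response onto the span of the chosen columns.

```python
import numpy as np

# Simulated data (an assumption for illustration): a curved trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 30)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(scale=0.3, size=x.size)

ones = np.ones_like(x)  # the constant vector 1


def residual_sum_of_squares(columns, response):
    """Least-squares fit of response onto span(columns); return the RSS."""
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    residuals = response - design @ coef
    return float(residuals @ residuals)


# Linear model: span{1, x}.  Quadratic model: span{1, x, x^2}.
rss_linear = residual_sum_of_squares([ones, x], y)
rss_quadratic = residual_sum_of_squares([ones, x, x**2], y)

print("RSS, linear model:   ", rss_linear)
print("RSS, quadratic model:", rss_quadratic)
```

Because the span of the linear model is contained in the span of the quadratic model, the residual sum of squares cannot increase when the quadratic term is added. Whether the reduction is large enough to justify the extra term is a separate question of model comparison.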
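The notation for these subspaces can be streamlined by writing $1 = \operatorname{span}\{1\}$ and $X = \operatorname{span}\{x\}$. (Notation care: $X$ is often a random variable, or a design matrix, as well.) With $X^2 = \operatorname{span}\{x^2\}$, the quadratic model above can be written as a sum of subspaces: $E(Y) \in 1 + X + X^2$.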
In general if , then the model is equivalent to The reason for requiring for each is concerned with indicator variables and will emerge later. This notation highlights the view of linear models as the specification of a subspace to which the linear predictor belongs.
Standard models
Model | (model formula) |
---|---|
Simple linear regression | $1 + X$ |
Quadratic regression | $1 + X + X^2$ |
Polynomial regression | $1 + X + X^2 + \dots + X^k$ |
Regression through the origin | $X$ |
Multiple regression | |
Multiple regression | . |
The degrees of freedom of model $L$ is $n - p$, where $p$ is the minimum number of vectors required to span $L$ and $n$ is the number of observations.
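The minimum number of spanning vectors is the dimension of the subspace, which can be computed as the rank of a matrix whose columns span it. The sketch below assumes the degrees of freedom are the number of observations minus that dimension. The example data, the dictionary `models`, and the helper `spanning_dimension` are hypothetical names introduced for illustration, and NumPy is assumed to be available.

```python
import numpy as np

n = 10  # number of observations (illustrative)
x = np.arange(1.0, n + 1.0)
ones = np.ones(n)

# Each model is given by a list of vectors that span its subspace.
models = {
    "1 + X": [ones, x],
    "1 + X + X^2": [ones, x, x**2],
    "X": [x],
    # A redundant spanning set: 2x adds nothing to span{1, x}.
    "1 + X (with redundant 2x)": [ones, x, 2.0 * x],
}


def spanning_dimension(columns):
    """Minimum number of vectors needed to span the same subspace."""
    return int(np.linalg.matrix_rank(np.column_stack(columns)))


dims = {name: spanning_dimension(cols) for name, cols in models.items()}
for name, dim in dims.items():
    print(f"{name}: dimension {dim}, degrees of freedom {n - dim}")
```

The last entry shows why the definition refers to the minimum number of vectors: listing a redundant vector changes the description of the model but not the subspace, so the dimension stays at 2.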
Exercise 6.47
Consider the example of predicting timber volume from measured tree height and trunk radius.
Define a linear predictor based on the volume of a cylinder.
If a tree is tall then it must have a wide trunk to support its height. Knowledge of one provides insight about the other, so the variables are likely to be correlated. How does this influence how to define the linear predictor?
Lattice diagrams provide a convenient format to summarise which models have been fitted. The diagram below gives all submodels for a linear predictor based on three variables.
[Unnumbered figure: lattice diagram of all submodels for a linear predictor based on three variables.]
With an increasing number of variables these lattices rapidly become complex.
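The growth in complexity can be made concrete by enumerating the lattice. The sketch below makes two assumptions that the text does not state: every submodel keeps the constant term $1$, and two models are joined by an edge when one is obtained from the other by adding a single variable. The variable labels `X1`, `X2`, `X3` and the function names are hypothetical.

```python
from itertools import combinations


def submodels(variables):
    """All subsets of the variables, smallest first; each subset is a submodel."""
    return [
        frozenset(subset)
        for size in range(len(variables) + 1)
        for subset in combinations(variables, size)
    ]


def lattice_edges(models):
    """Pairs (smaller, larger) where larger adds exactly one variable."""
    return [
        (a, b)
        for a in models
        for b in models
        if a < b and len(b) == len(a) + 1
    ]


def formula(model):
    """Model formula, assuming the constant term 1 is always included."""
    return " + ".join(["1"] + sorted(model))


models = submodels(["X1", "X2", "X3"])
edges = lattice_edges(models)

for model in models:
    print(formula(model))
print(len(models), "models,", len(edges), "edges")
```

Under these assumptions a lattice on $p$ variables has $2^p$ models and $p\,2^{p-1}$ edges, so three variables give 8 models and 12 edges, while five variables already give 32 models and 80 edges.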