Submission is due on Tuesday in Week 5.
In this question, we use the data in Table 1 of Question Sheet 1 to investigate whether or not the annual price of tea can be used to predict the annual price of sugar.
Write down an appropriate linear model to relate the annual price of tea to the annual price of sugar. You should include an intercept term, and define all your variables.
[marks: 2]
Fit the model described in part (a) and report the least squares estimates of your regression coefficients.
[marks: 1]
By obtaining a vector of estimated residuals, give an estimate of the residual variance $\sigma^2$.
[marks: 2]
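The workflow in this question (fit, extract coefficients, extract residuals, estimate the variance) can be sketched in R as follows. This is only a minimal sketch: the vectors `tea` and `sugar` and their values are invented for illustration and are not the data in Table 1, and the divisor n - p is the usual unbiased choice rather than something stated on this sheet.

```r
# Toy data, invented purely for illustration; these are NOT the values in Table 1.
tea   <- c(1.0, 2.0, 3.0, 4.0, 5.0)
sugar <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Simple linear regression of sugar price on tea price, with an intercept.
fit <- lm(sugar ~ tea)

# Least squares estimates of the intercept and slope.
beta_hat <- coef(fit)

# Vector of estimated residuals: observed minus fitted values.
e_hat <- resid(fit)

# Estimate of the residual variance: residual sum of squares over (n - p),
# where p = 2 counts the intercept and the slope.
n <- length(sugar)
p <- length(beta_hat)
sigma2_hat <- sum(e_hat^2) / (n - p)

print(beta_hat)
print(sigma2_hat)
```

The same value can be read off as `summary(fit)$sigma^2`, which is a convenient check on a hand calculation.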
The full FEV data set first discussed in Question Sheet 2 is available in the file fev. The data frame in this file has 654 records and six variables: age (years), FEV (litres), height (inches), gender (1 for male, 0 for female), smoker (0 for no, 1 for yes) and log FEV.
Consider the following linear regression model,
where is FEV, is a height and is age.
Use the function lm to fit this model to the full FEV data set. What are your least squares estimates of the regression coefficients?
[marks: 2]
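As a hedged illustration of what lm computes for a model with two explanatory variables, the sketch below fits a regression on a small invented data frame and compares the result with a direct solution of the normal equations. The data frame `toy`, its column names and its values are assumptions made for this example. They are not taken from the fev file, and the response used here may differ from the one in the model above.

```r
# Toy data frame invented for illustration; the column names are assumptions
# and the values are not taken from the fev file.
toy <- data.frame(
  fev    = c(1.7, 2.3, 2.9, 3.4, 3.1, 4.0),
  height = c(57, 61, 64, 67, 66, 70),
  age    = c(8, 10, 12, 14, 13, 16)
)

# Fit a linear model with an intercept, height and age.
fit <- lm(fev ~ height + age, data = toy)
beta_lm <- coef(fit)

# Design matrix (a column of ones, then height and age) and response vector.
X <- model.matrix(fit)
y <- toy$fev

# Solve the normal equations (X'X) b = X'y directly.
beta_ne <- solve(t(X) %*% X, t(X) %*% y)

# The two sets of estimates should agree up to rounding error.
print(cbind(lm = beta_lm, normal_eq = as.vector(beta_ne)))
```

In practice lm uses a QR decomposition rather than forming X'X explicitly, so the direct solve is shown only to make the least squares formula concrete.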
For this model and data set,
and the estimated residual variance is .
Using these results, is there evidence of a linear relationship between log FEV and height? State clearly your hypotheses and conclusions.
[marks: 3]
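For reference, the usual test for a single regression coefficient under the normal-errors linear model takes the following form. The notation is an assumption for this note and is not taken from the sheet: $\beta_j$ is the coefficient of interest, $c_{jj}$ is the corresponding diagonal element of $(X^T X)^{-1}$, $n$ is the number of records and $p$ is the number of regression coefficients.

```latex
H_0 : \beta_j = 0 \quad \text{against} \quad H_1 : \beta_j \neq 0,
\qquad
t = \frac{\hat{\beta}_j}{\sqrt{\hat{\sigma}^2 \, c_{jj}}} \sim t_{n-p}
\quad \text{under } H_0 .
```

A value of $|t|$ larger than the relevant quantile of the $t_{n-p}$ distribution is evidence against $H_0$ at the chosen significance level.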
Challenge
Generalised least squares is a technique which can be used to estimate the regression coefficients when the residuals in a linear regression model have unequal variances and/or are correlated with each other. In this case a matrix of weights is defined where
The generalised least squares estimator for is .
Show that
The generalised least squares estimator is unbiased.
[marks: 1]
The variance of is
[marks: 2]
Show that in the case of uncorrelated residuals with equal variances the generalised least squares estimator reduces to the ordinary least squares estimator, $\hat{\beta} = (X^T X)^{-1} X^T y$.
[marks: 2]