
MATH235 Week 5 - Workshop problems

If there is not enough time to discuss all of the problems below in the workshop, please attempt the remaining ones on your own.

WS5.1 

We return to the Childhood Respiratory Disease Study, first introduced in Question Sheet 2. Recall that the response variable is FEV (forced expiratory volume, litres) and the explanatory variables are age (years), height (inches), gender (male/female) and smoker (yes/no). Consider three models for log FEV,

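$\mathbb{E}[\log \mathrm{FEV}_i] = \beta_1 + \beta_2\,\mathrm{age}_i$,  (FEV1)
$\mathbb{E}[\log \mathrm{FEV}_i] = \beta_1 + \beta_2\,\mathrm{height}_i$,  (FEV2)
$\mathbb{E}[\log \mathrm{FEV}_i] = \beta_1 + \beta_2\,\mathrm{height}_i + \beta_3\,I[\mathrm{male}_i]$,  (FEV3)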

where $I[\mathrm{male}_i]$ is an indicator function, taking the value 1 if individual $i$ is male, and 0 otherwise.
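As an illustration of how the indicator enters the design matrix, the following minimal sketch builds rows of the form (1, height, I[male]) for the third model. The three records are hypothetical values invented for the example, not data from the study, and the helper name `fev3_row` is likewise an assumption.

```python
# Hypothetical records, invented for illustration only (not study data).
records = [
    {"height": 57.0, "gender": "female"},
    {"height": 67.5, "gender": "male"},
    {"height": 61.0, "gender": "male"},
]


def fev3_row(height, gender):
    """Return one design-matrix row (intercept, height, male indicator)."""
    male_indicator = 1.0 if gender == "male" else 0.0
    return [1.0, height, male_indicator]


# One row per individual; the columns multiply beta_1, beta_2 and beta_3.
X = [fev3_row(r["height"], r["gender"]) for r in records]

for row in X:
    print(row)
```

With this coding, $\beta_3$ is the difference in expected log FEV between a male and a female of the same height.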

  1. (a)

    Which of models FEV1–FEV3 are nested? Which are not nested? Explain your answer.

  2. (b)

    Now focus on model FEV3. Using the data in Table 2 to fit this model,

    $\hat{\sigma}^2 = 0.0200$ and

    $$(X^\top X)^{-1} = \begin{bmatrix} 8.01166 & -0.12677 & -0.09828 \\ -0.12677 & 0.00204 & -0.00072 \\ -0.09828 & -0.00072 & 0.22003 \end{bmatrix}.$$

    • (i)

      What is the variance matrix for the least squares estimator $\hat{\beta} = (\hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3)$? What is the variance of the intercept term?

    • (ii)

      Given that $\hat{\beta} = (-1.97, 0.046, 0.024)$, find a 95% confidence interval for $\beta_3$.

    • (iii)

      Use your confidence interval from part (ii) to test at the 5% level whether there is evidence for a significant gender effect. As usual, you should state your hypotheses and conclusions.
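The calculations in part (b) can be checked numerically. The sketch below assumes the estimated variance matrix is $\hat{\sigma}^2 (X^\top X)^{-1}$ and that an interval has the form estimate ± critical value × standard error. Because the sample size is not stated on this sheet, the critical value is taken from the standard normal distribution as a large-sample stand-in for the $t_{n-3}$ quantile; that substitution is an assumption, and the exact $t$ quantile should be used when $n$ is known.

```python
import math
from statistics import NormalDist

# Quantities quoted in the question.
sigma2_hat = 0.0200
xtx_inv = [
    [8.01166, -0.12677, -0.09828],
    [-0.12677, 0.00204, -0.00072],
    [-0.09828, -0.00072, 0.22003],
]
beta_hat = [-1.97, 0.046, 0.024]

# Estimated variance matrix of the least squares estimator.
var_beta = [[sigma2_hat * entry for entry in row] for row in xtx_inv]

# Standard errors are the square roots of the diagonal entries.
se = [math.sqrt(var_beta[j][j]) for j in range(3)]

# ASSUMPTION: n is large, so the t quantile is approximated by the
# standard normal 97.5% quantile (about 1.96).
crit = NormalDist().inv_cdf(0.975)

# Interval for the third coefficient (index 2 in zero-based Python).
j = 2
ci_beta3 = (beta_hat[j] - crit * se[j], beta_hat[j] + crit * se[j])

# The 5% two-sided test rejects beta_3 = 0 exactly when 0 lies outside.
contains_zero = ci_beta3[0] <= 0.0 <= ci_beta3[1]

print("Variance of intercept estimator:", var_beta[0][0])
print("Standard error of beta_3 estimate:", se[j])
print("Approximate 95% interval for beta_3:", ci_beta3)
print("Interval contains zero:", contains_zero)
```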

WS5.2 

Consider the model

$\mathbb{E}[Y_i] = \beta_1 + \beta_2 x_{i,1} + \beta_3 x_{i,2}$

where $x_{i,1}$ and $x_{i,2}$ are indicator functions for a two-level factor, i.e. $x_{i,1}$ takes the value 1 if individual $i$ has level 1, and 0 otherwise, whereas $x_{i,2}$ takes the value 1 if individual $i$ has level 2, and 0 otherwise. Out of $n$ individuals, the first $n_1$ take level 1 of the factor and the remaining $n_2 = n - n_1$ take level 2.

  1. (a)

    Write down the design matrix X for this model.

  2. (b)

    Calculate $X^\top X$, and explain why this matrix is singular (non-invertible).

  3. (c)

    Explain what this result tells us about how factors should be included in a linear model.
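A small numerical check can accompany the algebra in WS5.2. The sketch below builds the design matrix for hypothetical group sizes $n_1 = 2$ and $n_2 = 3$ (chosen purely for illustration), forms $X^\top X$, and evaluates its determinant. The helper names `design_matrix`, `gram` and `det3` are assumptions introduced for this example.

```python
def design_matrix(n1, n2):
    """Rows (1, x_i1, x_i2): first n1 rows at level 1, remaining n2 at level 2."""
    level1 = [[1, 1, 0] for _ in range(n1)]
    level2 = [[1, 0, 1] for _ in range(n2)]
    return level1 + level2


def gram(X):
    """Return X^T X for a matrix stored as a list of rows."""
    p = len(X[0])
    return [[sum(row[j] * row[k] for row in X) for k in range(p)]
            for j in range(p)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical group sizes, for illustration only.
n1, n2 = 2, 3
X = design_matrix(n1, n2)
XtX = gram(X)
d = det3(XtX)

# In every row the intercept column equals the sum of the two indicator
# columns, so the columns of X are linearly dependent.
columns_dependent = all(row[0] == row[1] + row[2] for row in X)

print("X^T X =", XtX)
print("det(X^T X) =", d)
print("Intercept column = sum of indicator columns:", columns_dependent)
```

The same check can be repeated with other values of `n1` and `n2`; the linear dependence between the columns does not depend on the group sizes.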