MATH235

MATH235 Week 3 - Assessed problems (coursework)

Submission is due on Tuesday in Week 4.

CW3.1 

For the Olympics data in Table 0.12, a linear regression model in which long jump distance is used to predict high jump height is proposed,

$$Y_i = \beta_1 + \beta_2 x_i + \epsilon_i, \quad i = 1, \ldots, n,$$

where $Y_i$ is high jump height and $x_i$ is long jump distance.

Observation  Year  High Jump  Long Jump  Discus Throw
          1  1896      71.25    249.750       1147.50
          2  1900      74.80    282.875       1418.90
          3  1904      71.00    289.000       1546.50
          4  1952      80.32    298.000       2166.85
          5  1960      85.00    319.750       2330.00
          6  1964      85.75    317.750       2401.50
          7  1972      87.75    324.500       2535.00
          8  1976      88.50    328.500       2657.40
          9  1980      92.75    336.250       2624.00
         10  1984      92.50    336.250       2622.00
Table 0.12: Winning Olympic high jump heights and long jump and discus distances for a sample of 10 years. All measurements are in inches.
  (a)

    What assumptions are made about the residuals $\epsilon_1, \ldots, \epsilon_n$?

    [marks: 2]

  (b)
    (i)

      Using the data in Table 0.12, write out the design matrix for this model.

      [marks: 1]

    (ii)

      Obtain the least squares estimates for $\beta = (\beta_1, \beta_2)$.

      [marks: 2]
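As a rough illustration of the mechanics behind part (b), the sketch below builds a design matrix with a column of ones and a column of long jump distances, then solves the normal equations $(X^T X)\beta = X^T y$ for the two coefficients. The function names `design_matrix` and `least_squares` are hypothetical helpers introduced here, not part of the course material, and the explicit 2 x 2 solve is just one of several ways to obtain the estimates.

```python
# Data copied from Table 0.12 (all measurements in inches).
long_jump = [249.750, 282.875, 289.000, 298.000, 319.750,
             317.750, 324.500, 328.500, 336.250, 336.250]
high_jump = [71.25, 74.80, 71.00, 80.32, 85.00,
             85.75, 87.75, 88.50, 92.75, 92.50]


def design_matrix(x):
    """Return the n x 2 design matrix whose i-th row is (1, x_i)."""
    return [[1.0, xi] for xi in x]


def least_squares(X, y):
    """Solve the normal equations (X'X) beta = X'y for a two-column X."""
    # Entries of the symmetric 2 x 2 matrix X'X.
    s00 = sum(row[0] * row[0] for row in X)
    s01 = sum(row[0] * row[1] for row in X)
    s11 = sum(row[1] * row[1] for row in X)
    # Entries of the vector X'y.
    t0 = sum(row[0] * yi for row, yi in zip(X, y))
    t1 = sum(row[1] * yi for row, yi in zip(X, y))
    # Solve the 2 x 2 system explicitly via its determinant.
    det = s00 * s11 - s01 * s01
    beta1 = (s11 * t0 - s01 * t1) / det
    beta2 = (s00 * t1 - s01 * t0) / det
    return beta1, beta2


X = design_matrix(long_jump)
beta1_hat, beta2_hat = least_squares(X, high_jump)
print("design matrix rows:", X)
print("beta1_hat =", beta1_hat, "beta2_hat =", beta2_hat)
```

A quick sanity check on any such fit is that the fitted line passes through the point of sample means and that the fitted residuals sum to zero.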

CW3.2 

A survey of five ant colonies was conducted to assess whether the mean ant length varies between colonies. The results are shown in Table 0.13 and can also be found in the dataframe antLengths. Lengths are in mm. Let $\mu_i$ denote the population mean length for colony $i$.

Ant length (mm) Group
8.1 1
7.7 1
8.1 1
7.8 1
7.6 1
8.2 1
8.0 1
7.6 1
9.9 2
10.0 2
9.9 2
9.7 2
10.0 2
9.5 2
9.1 2
11.4 2
9.9 3
10.0 3
10.3 3
9.6 3
10.6 3
10.6 3
14.4 3
13.6 3
14.3 4
13.5 4
13.0 4
13.7 4
13.9 4
14.8 4
16.7 4
16.5 4
15.3 5
16.3 5
15.3 5
15.9 5
16.0 5
19.2 5
18.1 5
17.1 5
Table 0.13: Ant lengths (mm) for ants from five colonies.
  (a)

    Give appropriate null and alternative hypotheses to test whether the mean ant length varies between colonies.

    [marks: 1]

  (b)

    Calculate the sample group means for all five groups.

    [marks: 1]

  (c)

    Given that the overall mean length of the samples is 12.03 mm and that the total sum of squares is SST = 453.864, carry out a one-way ANOVA to test your hypotheses from part (a). Test at the 5% significance level and state your conclusions clearly.

    [marks: 3]
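The arithmetic behind a one-way ANOVA can be sketched as follows: the total sum of squares is split into a between-group part and a within-group part, and the ratio of the corresponding mean squares gives the F statistic. The helper name `one_way_anova` and the labels `ssb` and `ssw` are assumptions made for this illustration; the course notes may use different notation. The 5% critical value of the relevant F distribution is not computed here and would need to be taken from tables or statistical software.

```python
# Ant lengths (mm) copied from Table 0.13, keyed by colony (group) number.
ant_lengths = {
    1: [8.1, 7.7, 8.1, 7.8, 7.6, 8.2, 8.0, 7.6],
    2: [9.9, 10.0, 9.9, 9.7, 10.0, 9.5, 9.1, 11.4],
    3: [9.9, 10.0, 10.3, 9.6, 10.6, 10.6, 14.4, 13.6],
    4: [14.3, 13.5, 13.0, 13.7, 13.9, 14.8, 16.7, 16.5],
    5: [15.3, 16.3, 15.3, 15.9, 16.0, 19.2, 18.1, 17.1],
}


def one_way_anova(groups):
    """Return group means, sums of squares, degrees of freedom and F."""
    all_values = [v for g in groups.values() for v in g]
    n = len(all_values)   # total number of observations
    k = len(groups)       # number of groups
    grand_mean = sum(all_values) / n
    group_means = {label: sum(g) / len(g) for label, g in groups.items()}
    # Total sum of squares about the grand mean.
    sst = sum((v - grand_mean) ** 2 for v in all_values)
    # Between-group sum of squares, weighted by group size.
    ssb = sum(len(g) * (group_means[label] - grand_mean) ** 2
              for label, g in groups.items())
    # Within-group sum of squares obtained by subtraction.
    ssw = sst - ssb
    f_stat = (ssb / (k - 1)) / (ssw / (n - k))
    return {
        "grand_mean": grand_mean,
        "group_means": group_means,
        "sst": sst,
        "ssb": ssb,
        "ssw": ssw,
        "df": (k - 1, n - k),
        "f_stat": f_stat,
    }


result = one_way_anova(ant_lengths)
for name, value in result.items():
    print(name, "=", value)
```

The null hypothesis of equal means would be rejected at the 5% level if `f_stat` exceeds the upper 5% point of the F distribution with the degrees of freedom reported in `df`.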

CW3.3 

Challenge
Consider the following extension to the simple linear regression model,

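$$Y_i = \beta_1 + \beta_2 x_i + \epsilon_i, \quad i = 1, \ldots, n,$$

where the $\epsilon_i$ are independent but not identically distributed $\mathrm{Normal}(0, \sigma_i^2)$ random variables with $\sigma_i = \sigma \sqrt{x_i}$.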

  (a)

    Standardise $Y_i$ to create a random variable $Z_i$ which has a $\mathrm{Normal}(0, \sigma^2)$ distribution.

    [marks: 1]

  (b)

    By summing the squares of $Z_1, \ldots, Z_n$, create a sum of squares function for $(\beta_1, \beta_2)$.

    [marks: 2]

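  (c)

    Using the first-order partial derivatives of the sum of squares function derived in part (b), show that the least squares estimator of $\beta_2$ is

    $$\hat{\beta}_2 = \frac{\sum_{i=1}^{n} x_i^{-1} \sum_{i=1}^{n} Y_i - n \sum_{i=1}^{n} x_i^{-1} Y_i}{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i^{-1} - n^2}.$$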

    [marks: 2]
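One way to sanity-check a closed-form estimator of this kind is to compare it numerically with a direct weighted least squares fit. The sketch below assumes the error variance is proportional to $x_i$ (so each squared residual is weighted by $1/x_i$), which is the variance structure under which the estimator stated in part (c) arises. The data values and the function names `closed_form_beta2` and `weighted_least_squares` are hypothetical and invented purely for illustration. This is a numerical check, not a substitute for the derivation the question asks for.

```python
# Hypothetical data, invented purely for illustration (all x_i must be positive).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def closed_form_beta2(x, y):
    """Evaluate the estimator of beta_2 stated in part (c)."""
    n = len(x)
    sum_inv_x = sum(1.0 / xi for xi in x)
    sum_y = sum(y)
    sum_y_over_x = sum(yi / xi for xi, yi in zip(x, y))
    sum_x = sum(x)
    return (sum_inv_x * sum_y - n * sum_y_over_x) / (sum_x * sum_inv_x - n ** 2)


def weighted_least_squares(x, y, w):
    """Minimise sum_i w_i * (y_i - b1 - b2 * x_i)^2 via the 2 x 2 normal equations."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx
    b1 = (swxx * swy - swx * swxy) / det
    b2 = (sw * swxy - swx * swy) / det
    return b1, b2


# Assumption: Var(eps_i) is proportional to x_i, so the weights are 1 / x_i.
weights = [1.0 / xi for xi in x]
b1_wls, b2_wls = weighted_least_squares(x, y, weights)
b2_closed = closed_form_beta2(x, y)
print("weighted least squares beta2:", b2_wls)
print("closed-form beta2:           ", b2_closed)
```

If the two printed values agree to within floating-point error, the closed form is consistent with minimising the weighted sum of squares for this data set.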