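We return to the birth weight data in Example 6.2.1. The full data set is given in Table 7.1. We will fit the simple linear regression for birth weight with gestational age as the explanatory variable,
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad i = 1, \ldots, 24,$$
where $y_i$ is the birth weight (grams) and $x_i$ is the gestational age (weeks) of child $i$.
The response vector and design matrix are
$$\mathbf{y} = (2968, 2795, 3163, \ldots, 3231)^T$$
and
$$X = \begin{pmatrix} 1 & 40 \\ 1 & 38 \\ 1 & 40 \\ \vdots & \vdots \\ 1 & 40 \end{pmatrix}.$$
Obtain the least squares estimate $\hat{\boldsymbol{\beta}}$.
To find $\hat{\boldsymbol{\beta}}$ we use the formula
$$\hat{\boldsymbol{\beta}} = (X^T X)^{-1} X^T \mathbf{y}.$$
From the response vector and design matrix above,
$$X^T X = \begin{pmatrix} 24 & 925 \\ 925 & 35727 \end{pmatrix}, \qquad X^T \mathbf{y} = \begin{pmatrix} 71224 \\ 2753867 \end{pmatrix}.$$
Therefore,
$$\hat{\boldsymbol{\beta}} = \frac{1}{1823} \begin{pmatrix} 35727 & -925 \\ -925 & 24 \end{pmatrix} \begin{pmatrix} 71224 \\ 2753867 \end{pmatrix} = \begin{pmatrix} -1484.98 \\ 115.53 \end{pmatrix}.$$
The fitted model for birth weight, given gestational age at birth, is
$$\hat{y} = -1484.98 + 115.53\,x.$$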
We can interpret this as follows:
For every additional week of gestation, expected birth weight increases by approximately 115.5 grams.
If a child were born at zero weeks of gestation, their expected birth weight would be approximately $-1485$ grams.
Why does the second result not make sense?
Because the matrices involved can be quite large, whether due to a large sample size $n$, a large number of explanatory variables, or both, it is useful to be able to calculate parameter estimates using computer software. In R, we can do this ‘by hand’ (treating R as a calculator), or we can make use of the function lm, which will carry out the entire model fit. We illustrate both ways.
Load the data set bwt into R. To obtain the size of the data set,
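A minimal sketch of this step, assuming bwt is held as a data frame. The data file is not available here, so the stand-in below is typed in from Table 7.1, and the column names Age, Weight and Gender are assumptions rather than the names used in the actual file.

```r
# Hypothetical stand-in for the bwt data set, typed in from Table 7.1.
# The column names Age, Weight and Gender are assumed for illustration.
bwt <- data.frame(
  Age    = c(40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38,
             40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40),
  Weight = c(2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473,
             2628, 3176, 3421, 2975, 3317, 2729, 2935, 2754,
             3210, 2817, 3126, 2539, 2412, 2991, 2875, 3231),
  Gender = rep(c("M", "F"), each = 12)
)

dim(bwt)    # rows (subjects) then columns (variables): 24 3
names(bwt)  # the variable names
```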
This tells us that there are 24 subjects and 3 variables. The variables are gestational age (weeks), birth weight (grams) and gender.
To fit the simple linear regression of the previous example ‘by hand’,
Set up the design matrix,
View results
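The steps above might look as follows. This is a sketch rather than the original code: the vectors age and weight are hypothetical names, typed in from Table 7.1 so that the block runs on its own. The final lines fit the same model with lm as a check on the matrix calculation.

```r
# Data typed in from Table 7.1 (the names age and weight are assumptions).
age    <- c(40, 38, 40, 35, 36, 37, 41, 40, 37, 38, 40, 38,
            40, 36, 40, 38, 42, 39, 40, 37, 36, 38, 39, 40)
weight <- c(2968, 2795, 3163, 2925, 2625, 2847, 3292, 3473,
            2628, 3176, 3421, 2975, 3317, 2729, 2935, 2754,
            3210, 2817, 3126, 2539, 2412, 2991, 2875, 3231)

# Response vector
y <- weight

# Design matrix: a column of ones for the intercept, then gestational age
X <- cbind(1, age)

# Least squares estimate (X'X)^{-1} X'y
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y

# View results: intercept first, then the slope for gestational age
beta_hat

# The same fit using lm, for comparison
fit <- lm(weight ~ age)
coef(fit)
```

Both routes should give an intercept of about -1485 and a slope of about 115.5, up to rounding.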
Table 7.1: Gestational age, birth weight and gender of 24 children.

Child | Gestational age (weeks) | Birth weight (grams) | Gender |
---|---|---|---|
1 | 40 | 2968 | M |
2 | 38 | 2795 | M |
3 | 40 | 3163 | M |
4 | 35 | 2925 | M |
5 | 36 | 2625 | M |
6 | 37 | 2847 | M |
7 | 41 | 3292 | M |
8 | 40 | 3473 | M |
9 | 37 | 2628 | M |
10 | 38 | 3176 | M |
11 | 40 | 3421 | M |
12 | 38 | 2975 | M |
13 | 40 | 3317 | F |
14 | 36 | 2729 | F |
15 | 40 | 2935 | F |
16 | 38 | 2754 | F |
17 | 42 | 3210 | F |
18 | 39 | 2817 | F |
19 | 40 | 3126 | F |
20 | 37 | 2539 | F |
21 | 36 | 2412 | F |
22 | 38 | 2991 | F |
23 | 39 | 2875 | F |
24 | 40 | 3231 | F |
Recall Example 6.4.2, in which we investigated the relationship between gas consumption and external temperature. To measure the effect of changes in the external temperature on gas consumption, we fit the multiple linear regression model 6.6. We will allow a different relationship between gas consumption and outside temperature before and after the installation of cavity wall insulation. The model has four regression coefficients,
$$y_i = \beta_0 + \beta_1 x_i + \beta_2 z_i + \beta_3 x_i z_i + \epsilon_i.$$
Here $y_i$ is gas consumption, $x_i$ is outside temperature and $z_i$ is an indicator variable taking the value 1 after installation and 0 before.
The data are shown in Table 7.2.
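To estimate the parameters by hand, we first set up the response vector and design matrix,
$$\mathbf{y} = (7.2, 6.9, \ldots, 2.6, 4.8, \ldots, 3.4)^T$$
and
$$X = \begin{pmatrix} 1 & -0.8 & 0 & 0 \\ 1 & -0.7 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 10.2 & 0 & 0 \\ 1 & -0.7 & 1 & -0.7 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 4.9 & 1 & 4.9 \end{pmatrix},$$
where the columns of $X$ correspond to the intercept, outside temperature, the insulation indicator, and the product of temperature and indicator.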
Since $X^T X$ will be a $4 \times 4$ matrix, it is easier to do our calculations in R. First load the data set gas.
Insulate contains Before or After to indicate whether or not cavity wall insulation has taken place;
Temp contains outside temperature;
Gas contains gas consumption;
Insulate2 contains a 0 or 1 to indicate before (0) or after (1) cavity wall insulation.
To set up the design matrix,
Then to obtain $\hat{\boldsymbol{\beta}}$,
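A sketch of these two steps, assuming gas is a data frame with the columns Insulate, Temp, Gas and Insulate2 described above. The data frame is typed in from Table 7.2 so that the block is self-contained, and the column order chosen for the design matrix (intercept, temperature, indicator, product) is an assumption.

```r
# Stand-in for the gas data set, typed in from Table 7.2.
gas <- data.frame(
  Insulate = rep(c("Before", "After"), times = c(26, 18)),
  Temp = c(-0.8, -0.7, 0.4, 2.5, 2.9, 3.2, 3.6, 3.9, 4.2, 4.3, 5.4, 6.0, 6.0,
           6.0, 6.2, 6.3, 6.9, 7.0, 7.4, 7.5, 7.5, 7.6, 8.0, 8.5, 9.1, 10.2,
           -0.7, 0.8, 1.0, 1.4, 1.5, 1.6, 2.3, 2.5, 2.5, 3.1, 3.9, 4.0, 4.0,
           4.2, 4.3, 4.6, 4.7, 4.9),
  Gas = c(7.2, 6.9, 6.4, 6.0, 5.8, 5.8, 5.6, 4.7, 5.8, 5.2, 4.9, 4.9, 4.3,
          4.4, 4.5, 4.6, 3.7, 3.9, 4.2, 4.0, 3.9, 3.5, 4.0, 3.6, 3.1, 2.6,
          4.8, 4.6, 4.7, 4.0, 4.2, 4.2, 4.1, 4.0, 3.5, 3.2, 3.9, 3.5, 3.7,
          3.5, 3.5, 3.7, 3.5, 3.4)
)
# 0 before insulation, 1 after
gas$Insulate2 <- as.numeric(gas$Insulate == "After")

# Response vector
y <- gas$Gas

# Design matrix: intercept, temperature, indicator, temperature x indicator
X <- cbind(1, gas$Temp, gas$Insulate2, gas$Temp * gas$Insulate2)

# Least squares estimate (X'X)^{-1} X'y
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y
round(beta_hat, 3)
```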
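Thus the fitted model is
$$\hat{y} = 6.85 - 0.393\,x - 2.26\,z + 0.144\,xz,$$
where $x$ is outside temperature (°C) and $z$ is the insulation indicator (0 before, 1 after).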
Before cavity wall insulation, when the outside temperature is 0°C, the expected gas consumption is 6.85 thousand cubic feet.
Before cavity wall insulation, for every increase in temperature of 1°C, the expected gas consumption decreases by 0.393 thousand cubic feet.
After cavity wall insulation, for every increase in temperature of 1°C, the expected gas consumption decreases by 0.249 thousand cubic feet.
Where does the figure 0.249 come from?
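Set the insulation indicator equal to 1 in the fitted model; the coefficient of temperature then becomes $-0.393 + 0.144 = -0.249$, which is the overall rate of change of gas consumption with temperature after insulation.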
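As a worked version of this substitution, write $x$ for outside temperature and $z$ for the insulation indicator (notation assumed here). The indicator coefficient is taken to be roughly $-2.26$, a value computed from the data in Table 7.2 rather than quoted in the text. Under those assumptions the fitted line after insulation is approximately

$$\hat{y}\,\big|_{z=1} = (6.85 - 2.26) + (-0.393 + 0.144)\,x = 4.59 - 0.249\,x.$$

The intercepts of the two lines therefore differ by the indicator coefficient, and the slopes differ by the interaction coefficient.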
What is the expected gas consumption after cavity wall insulation, when the outside temperature is C?
We can alternatively fit this model in R using lm,
We have used the operator * between two explanatory variables. R then includes an intercept, a term for each of the explanatory variables, and the interaction between the two explanatory variables. We will look at interactions in more detail later.
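A sketch of what such a call might look like, assuming the column names Temp, Gas and Insulate2 described earlier. The data frame is again typed in from Table 7.2 so that the block runs on its own, and the object name fit is hypothetical.

```r
# Stand-in for the gas data set, typed in from Table 7.2.
gas <- data.frame(
  Temp = c(-0.8, -0.7, 0.4, 2.5, 2.9, 3.2, 3.6, 3.9, 4.2, 4.3, 5.4, 6.0, 6.0,
           6.0, 6.2, 6.3, 6.9, 7.0, 7.4, 7.5, 7.5, 7.6, 8.0, 8.5, 9.1, 10.2,
           -0.7, 0.8, 1.0, 1.4, 1.5, 1.6, 2.3, 2.5, 2.5, 3.1, 3.9, 4.0, 4.0,
           4.2, 4.3, 4.6, 4.7, 4.9),
  Gas = c(7.2, 6.9, 6.4, 6.0, 5.8, 5.8, 5.6, 4.7, 5.8, 5.2, 4.9, 4.9, 4.3,
          4.4, 4.5, 4.6, 3.7, 3.9, 4.2, 4.0, 3.9, 3.5, 4.0, 3.6, 3.1, 2.6,
          4.8, 4.6, 4.7, 4.0, 4.2, 4.2, 4.1, 4.0, 3.5, 3.2, 3.9, 3.5, 3.7,
          3.5, 3.5, 3.7, 3.5, 3.4),
  Insulate2 = rep(c(0, 1), times = c(26, 18))  # 0 = before, 1 = after
)

# Temp * Insulate2 expands to Temp + Insulate2 + Temp:Insulate2
fit <- lm(Gas ~ Temp * Insulate2, data = gas)
coef(fit)
```

The numeric indicator Insulate2 is used here rather than Insulate. With a character or factor variable, R picks the reference level alphabetically (After), which would give an equivalent fit with a different parameterisation.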
The model suggests that cavity wall insulation decreases gas consumption when the outside temperature is 0°C. Further, the rate of increase in gas consumption as the outside temperature decreases is less when the cavity wall is insulated. Are these differences significant?
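Table 7.2: Gas consumption and outside temperature before and after cavity wall insulation.

Observation | Insulation | Outside temp. (°C) | Gas consumption (1000s cubic feet) |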
---|---|---|---|
1 | Before | -0.8 | 7.2 |
2 | Before | -0.7 | 6.9 |
3 | Before | 0.4 | 6.4 |
4 | Before | 2.5 | 6.0 |
5 | Before | 2.9 | 5.8 |
6 | Before | 3.2 | 5.8 |
7 | Before | 3.6 | 5.6 |
8 | Before | 3.9 | 4.7 |
9 | Before | 4.2 | 5.8 |
10 | Before | 4.3 | 5.2 |
11 | Before | 5.4 | 4.9 |
12 | Before | 6.0 | 4.9 |
13 | Before | 6.0 | 4.3 |
14 | Before | 6.0 | 4.4 |
15 | Before | 6.2 | 4.5 |
16 | Before | 6.3 | 4.6 |
17 | Before | 6.9 | 3.7 |
18 | Before | 7.0 | 3.9 |
19 | Before | 7.4 | 4.2 |
20 | Before | 7.5 | 4.0 |
21 | Before | 7.5 | 3.9 |
22 | Before | 7.6 | 3.5 |
23 | Before | 8.0 | 4.0 |
24 | Before | 8.5 | 3.6 |
25 | Before | 9.1 | 3.1 |
26 | Before | 10.2 | 2.6 |
27 | After | -0.7 | 4.8 |
28 | After | 0.8 | 4.6 |
29 | After | 1.0 | 4.7 |
30 | After | 1.4 | 4.0 |
31 | After | 1.5 | 4.2 |
32 | After | 1.6 | 4.2 |
33 | After | 2.3 | 4.1 |
34 | After | 2.5 | 4.0 |
35 | After | 2.5 | 3.5 |
36 | After | 3.1 | 3.2 |
37 | After | 3.9 | 3.9 |
38 | After | 4.0 | 3.5 |
39 | After | 4.0 | 3.7 |
40 | After | 4.2 | 3.5 |
41 | After | 4.3 | 3.5 |
42 | After | 4.6 | 3.7 |
43 | After | 4.7 | 3.5 |
44 | After | 4.9 | 3.4 |