6 Linear predictor and model formula

6.4 Interaction

Definition 6.4.1.

An interaction arises when the relationship between two or more explanatory variables and the response is not purely additive.

Consider two explanatory variables $\mathbf{x}_1=(x_{1,1},\dots,x_{1,n})$ and $\mathbf{x}_2=(x_{2,1},\dots,x_{2,n})$. We have seen that the additive model can be written as:

\[
\eta_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i}.
\]

However, the two variables can interact with one another in a number of complex ways. For example:

\[
\eta_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 \exp\{x_{1,i} + \sin(x_{1,i}x_{2,i})\}.
\]

It is extremely difficult to justify such a complex form, so we often limit ourselves to the following simple form to model interactions:

\[
\eta_i = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + \beta_3 x_{1,i}x_{2,i}.
\]
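
One way to read the product term, offered here as a short rearrangement rather than as part of the original derivation, is to collect the terms in $x_{1,i}$:

\[
\eta_i = \beta_0 + \beta_2 x_{2,i} + (\beta_1 + \beta_3 x_{2,i})\,x_{1,i}.
\]

The slope with respect to $x_{1,i}$ is then $\beta_1 + \beta_3 x_{2,i}$, which changes with $x_{2,i}$ unless $\beta_3 = 0$, in which case the additive model is recovered.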

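If $X_1=\operatorname{span}(\mathbf{1},\mathbf{x}_1)$ and $X_2=\operatorname{span}(\mathbf{1},\mathbf{x}_2)$ are the corresponding subspaces, then the additive model states that $\boldsymbol{\eta}\in X_1+X_2=\operatorname{span}(\mathbf{1},\mathbf{x}_1,\mathbf{x}_2)$. When we wish to include interactions, the linear predictor belongs to the product of the two subspaces, $\boldsymbol{\eta}\in X_1.X_2=\operatorname{span}(\mathbf{1},\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_1.\mathbf{x}_2)$, where $\mathbf{x}_1.\mathbf{x}_2$ is the elementwise product of the two vectors.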
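
As a minimal sketch of the product subspace, the Python below builds the four spanning columns (the constant, the two variables and their elementwise product) and evaluates the linear predictor. The function names, the data and the coefficient values are all hypothetical, chosen only for illustration.

```python
def interaction_design(x1, x2):
    """Return one row (1, x1_i, x2_i, x1_i * x2_i) per observation."""
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")
    return [(1.0, a, b, a * b) for a, b in zip(x1, x2)]


def linear_predictor(beta, design):
    """Evaluate eta_i as the sum over j of beta_j times entry j of row i."""
    return [sum(bj * xij for bj, xij in zip(beta, row)) for row in design]


# Hypothetical data and coefficients (assumptions, not from the notes).
x1 = [1.0, 2.0, 3.0]
x2 = [0.5, 0.0, -1.0]
beta = (1.0, 2.0, -1.0, 0.5)

design = interaction_design(x1, x2)
eta = linear_predictor(beta, design)
print(eta)
```

Setting the last coefficient to zero leaves a predictor that lies in the additive subspace.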

6.4.1 Interaction between categorical variables

Suppose $A$ is a factor representing two species of potato and $B$ a factor representing two varieties of fertilizer. Suppose the true yield $\mu=\eta$ under each of these four conditions is as follows:

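\[
\begin{array}{l|cc}
 & \text{Fertilizer } B_1 & \text{Fertilizer } B_2 \\ \hline
\text{Potato } A_1 & 20 & 22 \\
\text{Potato } A_2 & 25 & 27
\end{array}
\]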

This converts to the following array

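\[
\begin{array}{c|cccc}
\boldsymbol{\eta} & \mathbf{1} & \mathbf{a}_2 & \mathbf{b}_2 & \mathbf{a}_2\mathbf{b}_2 \\ \hline
20 & 1 & 0 & 0 & 0 \\
22 & 1 & 0 & 1 & 0 \\
25 & 1 & 1 & 0 & 0 \\
27 & 1 & 1 & 1 & 1
\end{array}
\]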

and with these values the linear predictor is

\[
\boldsymbol{\eta} = 20\,\mathbf{1} + 5\,\mathbf{a}_2 + 2\,\mathbf{b}_2 + 0\,\mathbf{a}_2\mathbf{b}_2.
\]

The distinguishing feature here is that the increase in yield due to species $A_2$ compared with species $A_1$ is the same with fertilizer $B_1$ ($25-20=5$) as with fertilizer $B_2$ ($27-22=5$). Similarly, the increase due to fertilizer $B_2$ over fertilizer $B_1$ is the same for both species ($22-20=2=27-25$). This makes it possible to talk about a species effect without having to specify which fertilizer is used, and about a fertilizer effect without having to specify which species is grown.
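
The arithmetic above can be sketched in Python. The function below is a hypothetical helper, not part of the notes, and it assumes that the indicator vectors are 0/1 codes for the second level of each factor, as in the array above. It reads the four coefficients directly off a two-by-two table of true yields.

```python
def two_by_two_coefficients(y11, y12, y21, y22):
    """Coefficients of 1, a2, b2 and a2.b2 for a 2x2 table.

    y_jk is the value in the cell for species A_j and fertilizer B_k.
    """
    b0 = y11                       # baseline cell (A1, B1)
    b_a2 = y21 - y11               # A2 versus A1, measured at B1
    b_b2 = y12 - y11               # B2 versus B1, measured at A1
    b_int = y22 - y21 - y12 + y11  # departure from additivity
    return b0, b_a2, b_b2, b_int


# Values from the first potato and fertilizer table.
coef = two_by_two_coefficients(20, 22, 25, 27)
print(coef)  # (20, 5, 2, 0)
```

The last coefficient is zero exactly when the species difference is the same under both fertilizers, which is the additivity described above.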

 
Exercise 6.51
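Alternatively, suppose that the yield is given by

\[
\begin{array}{l|cc}
 & \text{Fertilizer } B_1 & \text{Fertilizer } B_2 \\ \hline
\text{Potato } A_1 & 20 & 22 \\
\text{Potato } A_2 & 25 & 26
\end{array}
\]

Find a similar expression for $\boldsymbol{\eta}$ and interpret it.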

 

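The second example exhibits interaction between $A$ and $B$, whereas the first does not. With just two levels each, $A=\operatorname{span}(\mathbf{1},\mathbf{a}_2)$ and $B=\operatorname{span}(\mathbf{1},\mathbf{b}_2)$, we can define the additive subspace $A+B=\operatorname{span}(\mathbf{1},\mathbf{a}_2,\mathbf{b}_2)$ for the first example and the product subspace $A.B=\operatorname{span}(\mathbf{1},\mathbf{a}_2,\mathbf{b}_2,\mathbf{a}_2\mathbf{b}_2)$, which contains the interaction $\mathbf{a}_2\mathbf{b}_2$, for the second.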

6.4.2 Interaction of categorical and numerical variables

Consider the example earlier in this chapter regarding the relationship of weight with height among a population of school children. Here, the numerical explanatory variable is height and the categorical variable is gender. Let $G=\operatorname{span}(\mathbf{1},\mathbf{g})$ be the factor subspace for gender and $H=\operatorname{span}(\mathbf{1},\mathbf{h})$ be the subspace for height.

Earlier we examined the additive model:

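\[
\boldsymbol{\eta} = \beta_0 + \beta_1\mathbf{h} + \beta_2\mathbf{g} \in G+H = \operatorname{span}(\mathbf{1},\mathbf{h},\mathbf{g}).
\]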

For male children, the linear predictor increases by $\beta_2$ irrespective of height, which represents a shift in the intercept.
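
Written out for a single child, and assuming the indicator $g_i$ is coded as 1 for male and 0 for female children (this coding is an assumption, chosen to match the statement above), the additive model gives

\[
\eta_i =
\begin{cases}
\beta_0 + \beta_1 h_i, & g_i = 0, \\
(\beta_0 + \beta_2) + \beta_1 h_i, & g_i = 1.
\end{cases}
\]

Under this coding the two groups follow parallel lines in height, with common slope $\beta_1$ and intercepts that differ by $\beta_2$.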

The interaction model includes the elementwise product of vectors:

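\[
\boldsymbol{\eta} = \beta_0 + \beta_1\mathbf{h} + \beta_2\mathbf{g} + \beta_3\mathbf{g}\mathbf{h} \in G.H = \operatorname{span}(\mathbf{1},\mathbf{h},\mathbf{g},\mathbf{g}\mathbf{h}).
\]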

 
Exercise 6.52
Describe the meaning of $\beta_3$ in the interaction model.