3 The exponential family

3.7 How Y changes with covariates x

There are three notations for GLMs: generic, index, and vector. We use all three, for instance, in the specification of the link function and the linear predictor.

Linear predictor: The explanatory variables influence the distribution of Y through a single linear function, the linear predictor, which can be described as:

\[
\eta = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p \quad\text{(generic)}
\qquad\text{or}\qquad
\eta_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} \quad\text{(index)}
\qquad\text{or}\qquad
\boldsymbol{\eta} = X\boldsymbol{\beta} \quad\text{(vector).}
\]
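As a minimal sketch (not from the text; the data, coefficients and names are invented), the vector and index forms give the same linear predictor:

import numpy as np

X = np.array([[1.0, 0.5],
              [2.0, 1.5],
              [0.5, 3.0],
              [1.5, 2.5]])      # n x p matrix of explanatory variables (hypothetical)
beta = np.array([0.4, -0.2])    # coefficients beta_1, ..., beta_p (hypothetical)

eta = X @ beta                  # vector form: eta = X beta

# index form, one observation at a time: eta_i = beta_1 x_{1i} + ... + beta_p x_{pi}
eta_index = np.array([sum(b * x for b, x in zip(beta, row)) for row in X])
assert np.allclose(eta, eta_index)
print(eta)                      # [ 0.3  0.5 -0.4  0.1]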

Link function: The mean μ_i of Y_i and the linear predictor are related by a smooth invertible function g(·), called the link function:

\[
g(\mu) = \eta, \qquad\text{or}\qquad g(\mu_i) = \eta_i, \qquad\text{or}\qquad g(\boldsymbol{\mu}) = \boldsymbol{\eta}.
\]

Unnumbered Figure: Link
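For concreteness, a sketch (assuming a logit link, not tied to any particular example in the text) of how a smooth invertible g maps between μ and η:

import numpy as np

def logit(mu):
    # link function g(mu) = log(mu / (1 - mu)), mapping (0, 1) to the whole real line
    return np.log(mu / (1 - mu))

def inv_logit(eta):
    # inverse link g^{-1}(eta) = 1 / (1 + exp(-eta)), mapping back into (0, 1)
    return 1 / (1 + np.exp(-eta))

mu = 0.7                                # a hypothetical mean
eta = logit(mu)                         # g(mu) = eta
assert np.isclose(inv_logit(eta), mu)   # invertibility: g^{-1}(g(mu)) = mu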

Linearity of the linear predictor

The linear predictor η contains continuous variables such as x, but also known functions of these, e.g. x^2, log(x), combined as linear combinations, e.g. αx + 2x^2 − β log(x). The technical definition of linearity in GLMs refers to the linearity of η in the parameters, rather than in the explanatory variables themselves.

The linear predictor may also contain discrete variables and, importantly, indicator variables. For example, to differentiate between three groups, red, white and purple, let

  • a_i = 1 if the i-th member is red, and 0 otherwise,

  • b_i = 1 if the i-th member is white, and 0 otherwise,

  • c_i = 1 if the i-th member is purple, and 0 otherwise.

Linear combinations of these indicator variables, α𝐚 + β𝐛 + γ𝐜, in the linear predictor indicate to which group each unit belongs, and hence the group's effect on the response. For example, the effect of being red adds α units to the linear predictor, as in the sketch below.
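A sketch (with invented group labels and effects) of how the indicator variables enter the linear predictor:

import numpy as np

groups = ["red", "white", "purple", "red", "white"]   # hypothetical group labels

a = np.array([1 if g == "red"    else 0 for g in groups])
b = np.array([1 if g == "white"  else 0 for g in groups])
c = np.array([1 if g == "purple" else 0 for g in groups])

alpha, beta, gamma = 1.0, -0.5, 2.0     # hypothetical group effects

# alpha*a + beta*b + gamma*c picks out exactly one group effect per unit
eta_group = alpha * a + beta * b + gamma * c
print(eta_group)                        # [ 1.  -0.5  2.   1.  -0.5]

Note that a_i + b_i + c_i = 1 for every unit, so in a model that also contains an intercept one of the three indicators is usually dropped to avoid collinearity.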

Causal interpretation

The key to any interpretation of the fitted model is the way in which the mean of the response, μ = 𝔼[Y], changes with changes in the explanatory variables. The standard interpretation is to calculate the change in the expected response given a change in the explanatory variable x_j, holding all other variables constant. Thus the partial derivative ∂μ/∂x_j is an important part of this relationship, which is mediated through the linear predictor and the link function.

Using the chain rule:

\[
\frac{\partial \mu}{\partial x_j}
= \frac{d\mu}{d\eta}\,\frac{\partial \eta}{\partial x_j}
= \left(\frac{dg(\mu)}{d\mu}\right)^{-1} \frac{\partial \eta}{\partial x_j}
= \frac{\beta_j}{g'(\mu)},
\]

where g′(μ) = dg/dμ and μ is a function of the coefficients β_1, …, β_p. If β_j = 0 then changes in x_j lead to no change in the expected response, as long as the other variables are held constant. When β_j is not zero, the actual change depends on the value of μ through the derivative of the link function.
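To make this concrete, a numerical sketch (with hypothetical β and x, using the log link for which g′(μ) = 1/μ, so the formula gives ∂μ/∂x_j = β_j μ):

import numpy as np

beta = np.array([0.5, -1.0])     # hypothetical coefficients
x = np.array([1.2, 0.3])         # a single observation
j = 0                            # consider changes in x_1

eta = x @ beta                   # linear predictor
mu = np.exp(eta)                 # log link: g(mu) = log(mu), so mu = exp(eta)

# chain rule result: d mu / d x_j = beta_j / g'(mu) = beta_j * mu for the log link
dmu_dxj = beta[j] * mu

# check against a finite-difference approximation
h = 1e-6
x_plus = x.copy(); x_plus[j] += h
dmu_fd = (np.exp(x_plus @ beta) - mu) / h
assert np.isclose(dmu_dxj, dmu_fd, rtol=1e-4)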

 
Exercise 3.35
The function that relates the canonical parameter to the mean parameter for the Bernoulli distribution is θ = logit(μ). Suppose the link function (which relates the linear predictor to the mean parameter) is assumed to be the logit:

η=logit(μ).

Given a single continuous covariate x, find dμ/dx and interpret.

 

 
Exercise 3.36
(Continuation.) Suppose instead that η = α + βx + γx^2. Find dμ/dx.

 

Link function in practice

The link function g(μ)=η relates the moment parameter μ=𝔼[Y] to the linear predictor η, which is a linear combination of the covariates.

The default link functions are the logit for the Bernoulli distribution and the log for the Poisson distribution. The default for the Gaussian distribution is the identity function.

There are practical and theoretical reasons for choosing these defaults. The theoretical reason is that these are the canonical links for the Bernoulli and Poisson exponential families (EFs). The practical reason is that these functions transform the mean so that the coefficients are easier to interpret. For instance, in the AIDS example the increases are multiplicative (hence the log link), while for birthweight they are additive (hence the linear, or identity, link).
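A small sketch (with invented coefficients) of this practical point: with the log link a unit increase in x changes the mean multiplicatively, while with the identity link it changes the mean additively.

import numpy as np

beta0, beta1 = 1.0, 0.3          # hypothetical intercept and slope
x = 2.0

# log link (Poisson default): a unit increase in x multiplies the mean by exp(beta1)
mu_log      = np.exp(beta0 + beta1 * x)
mu_log_plus = np.exp(beta0 + beta1 * (x + 1))
assert np.isclose(mu_log_plus / mu_log, np.exp(beta1))

# identity link (Gaussian default): a unit increase in x adds beta1 to the mean
mu_id      = beta0 + beta1 * x
mu_id_plus = beta0 + beta1 * (x + 1)
assert np.isclose(mu_id_plus - mu_id, beta1)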