7 Model inference

7.1 Confidence intervals for parameters

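We have a log-likelihood $\ell(\boldsymbol{\beta}) = \ell(\boldsymbol{\mu})$ with $g(\boldsymbol{\mu}) = \boldsymbol{\eta} = X\boldsymbol{\beta}$ in a $p$-dimensional subspace of $\mathbb{R}^n$, for parameters $\boldsymbol{\beta}$ and link function $g$. We can maximise this to obtain the ML estimates $\hat{\boldsymbol{\beta}}$. Likelihood theory tells us that in the limit as $n \to \infty$, if $\boldsymbol{\beta}$ are the true parameter values, then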

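$$\hat{\boldsymbol{\beta}} \sim N(\boldsymbol{\beta}, V), \quad \text{with } V = \mathcal{E}(\hat{\boldsymbol{\beta}})^{-1},$$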

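where $\mathcal{E}(\hat{\boldsymbol{\beta}})$ is the expected information matrix $\mathcal{E}(\boldsymbol{\beta})$ evaluated at $\hat{\boldsymbol{\beta}}$, and

$$\mathcal{E}(\boldsymbol{\beta})_{jk} = \mathbb{E}\left[-\frac{\partial^2 \ell(\boldsymbol{\beta})}{\partial \beta_j \, \partial \beta_k}\right].$$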
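
As an illustration, here is a standard special case worked through: the Gaussian model with identity link, treating the variance $\sigma^2$ as known. The notation $x_{ij}$ for the $(i,j)$ entry of $X$ and $\mathbf{x}_i^\top$ for its $i$-th row is introduced here for the sketch.

```latex
\begin{align*}
\ell(\boldsymbol{\beta})
  &= -\frac{n}{2}\log(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - \mathbf{x}_i^\top\boldsymbol{\beta}\bigr)^2, \\
\frac{\partial^2 \ell(\boldsymbol{\beta})}{\partial\beta_j\,\partial\beta_k}
  &= -\frac{1}{\sigma^2}\sum_{i=1}^{n} x_{ij}\,x_{ik}, \\
\mathcal{E}(\boldsymbol{\beta})
  &= \frac{1}{\sigma^2}\,X^\top X,
  \qquad
  V = \sigma^2\,(X^\top X)^{-1}.
\end{align*}
```

The second derivative does not involve $y_i$, so the expectation is immediate, and $V$ reduces to the familiar covariance matrix of the least-squares estimator.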

R estimates the asymptotic variance in this way, and calculates standard errors of the parameters as the square roots of the diagonal elements of the matrix:

$$\operatorname{std}(\hat{\beta}_k) = \sqrt{V_{kk}}.$$
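
The relationship between the covariance matrix and the reported standard errors can be checked directly in R. The sketch below uses R's built-in `mtcars` data purely for illustration (it is not the data set used in these notes), and the variable names are assumptions of the example.

```r
# Illustrative fit on R's built-in mtcars data (not the birthweight data).
fit <- glm(mpg ~ wt, family = gaussian, data = mtcars)

# Estimated covariance matrix V of the coefficient estimates.
V <- vcov(fit)

# Standard errors: square roots of the diagonal elements of V.
se <- sqrt(diag(V))

# These should agree with the "Std. Error" column reported by summary().
se_summary <- coef(summary(fit))[, "Std. Error"]
print(cbind(by_hand = se, from_summary = se_summary))
```

The two columns printed at the end should coincide, since `summary()` takes its standard errors from the same matrix.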

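From the standard errors we can construct approximate $100(1-\alpha)\%$ confidence intervals:

$$\left(\hat{\beta}_k - z_{1-\alpha/2} \times \operatorname{std}(\hat{\beta}_k),\; \hat{\beta}_k + z_{1-\alpha/2} \times \operatorname{std}(\hat{\beta}_k)\right),$$

where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)\%$ quantile of the standard Normal distribution for some significance level $\alpha$.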
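
A minimal sketch of this interval in R follows. The helper `wald_ci` is a hypothetical name introduced here, and the estimate and standard error passed to it are made-up values for illustration only.

```r
# Hypothetical helper: normal-approximation interval from an estimate
# and its standard error, at significance level alpha.
wald_ci <- function(estimate, std_error, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)  # standard Normal quantile z_{1 - alpha/2}
  c(lower = estimate - z * std_error,
    upper = estimate + z * std_error)
}

# Made-up values for illustration only.
ci95 <- wald_ci(estimate = 2.0, std_error = 0.5)
ci99 <- wald_ci(estimate = 2.0, std_error = 0.5, alpha = 0.01)
print(ci95)
print(ci99)
```

A smaller $\alpha$ gives a wider interval. For a fitted model object, `confint.default()` computes intervals of this form from the estimated coefficients and their covariance matrix.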

For example, consider the birthweight data set. The following R extract fits a simple linear regression:

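$$\text{weight}_i \sim N(\mu_i, \sigma^2) \quad \text{with } \mu_i = \eta_i,$$
$$\eta_i = \beta_0 + \beta_1\,\text{age}_i \quad \text{for } i = 1, \ldots, n.$$
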
birthweight <- read.table("birthweight.dat")
model <- glm(weight ~ age, family = gaussian, data = birthweight)
summary(model)

The summary of the model fit is:

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-262.03  -158.29     8.35    88.15   366.50

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1485.0      852.6  -1.742   0.0955 .
age            115.5       22.1   5.228 3.04e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 37094.29)

    Null deviance: 1829873  on 23  degrees of freedom
Residual deviance:  816074  on 22  degrees of freedom
AIC: 324.53
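
As a worked illustration of the interval formula, the sketch below computes an approximate 95% confidence interval for the intercept, using the estimate and standard error copied from the coefficient table. The variable names are assumptions of this example.

```r
# Values copied from the (Intercept) row of the coefficient table.
estimate  <- -1485.0
std_error <- 852.6

# 97.5% quantile of the standard Normal distribution, about 1.96.
z <- qnorm(0.975)

ci <- c(lower = estimate - z * std_error,
        upper = estimate + z * std_error)
print(round(ci, 1))
# lower: -3156.1, upper: 186.1
```

The interval contains zero, which is consistent with the intercept not being significant at the 5% level in the table. Note that for the Gaussian family R reports t-based p-values (here on 22 degrees of freedom), so an interval built from the corresponding t quantile would be slightly wider than this Normal-based one.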

 
Exercise 7.53
Calculate the 95% confidence interval for the age coefficient.