We have a log-likelihood $\ell(\beta)$ with linear predictor $\eta = X\beta$ in a $p$-dimensional subspace of $\mathbb{R}^n$, for parameters $\beta = (\beta_1, \ldots, \beta_p)^T$ and link function $g$. We can maximise this to obtain ML estimates $\hat{\beta}$. Likelihood theory tells us that in the limit as $n \to \infty$, if $\beta^0$ are the true parameter values, then
$$\hat{\beta} \sim N_p\bigl(\beta^0,\; I(\beta^0)^{-1}\bigr) \quad \text{approximately,}$$
where $I(\beta^0)$ is the expected information matrix evaluated at $\beta^0$, and
$$[I(\beta)]_{jk} = -\mathbb{E}\left[\frac{\partial^2 \ell(\beta)}{\partial \beta_j \, \partial \beta_k}\right].$$
R estimates the asymptotic variance in this way, and calculates standard errors of the parameters as the square roots of the diagonal elements of the matrix
$$I(\hat{\beta})^{-1},$$
so that $\operatorname{s.e.}(\hat{\beta}_j) = \sqrt{\bigl[I(\hat{\beta})^{-1}\bigr]_{jj}}$.
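As a sketch of this computation (the notes use R; here is a stdlib-only Python illustration, and the information matrix values below are invented purely for the example, not taken from any fit in these notes):

```python
import math

# Hypothetical 2x2 expected information matrix I(beta-hat);
# the numbers are made up for illustration only.
info = [[4.0, 1.0],
        [1.0, 2.0]]

# Invert the 2x2 matrix analytically: inverse = adjugate / determinant.
det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
inv = [[ info[1][1] / det, -info[0][1] / det],
       [-info[1][0] / det,  info[0][0] / det]]

# Standard errors are square roots of the diagonal of the inverse.
se = [math.sqrt(inv[j][j]) for j in range(2)]
print(se)
```

For a real model, R performs the equivalent inversion internally and reports the square roots of the diagonal in the `Std. Error` column of `summary()`.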
From the standard errors we can construct approximate $100(1 - \alpha)\%$ confidence intervals:
$$\hat{\beta}_j \pm z_{1-\alpha/2}\, \operatorname{s.e.}(\hat{\beta}_j),$$
where $z_{1-\alpha/2}$ is the $100(1 - \alpha/2)\%$ quantile of the standard Normal distribution, for some significance level $\alpha$.
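This Wald interval can be computed directly from an estimate and its standard error. A minimal Python sketch (the notes work in R; the estimate and standard error below are hypothetical values chosen for illustration):

```python
from statistics import NormalDist

def wald_ci(estimate, se, alpha=0.05):
    """Approximate 100(1 - alpha)% Wald confidence interval."""
    # z is the 100(1 - alpha/2)% quantile of the standard Normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se

# Hypothetical estimate and standard error, for illustration only.
lo, hi = wald_ci(2.0, 0.5)
print(lo, hi)
```

For $\alpha = 0.05$ the quantile is $z_{0.975} \approx 1.96$, which is where the familiar "estimate $\pm$ 1.96 standard errors" rule comes from.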
For example, consider the birthweight data set. The following R extract fits a simple linear regression:
birthweight <- read.table("birthweight.dat")
model <- glm(weight ~ age, family = gaussian, data = birthweight)
summary(model)
The summary of the model fit is:
Deviance Residuals:
     Min       1Q   Median       3Q      Max
 -262.03  -158.29     8.35    88.15   366.50

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1485.0      852.6  -1.742   0.0955 .
age            115.5       22.1   5.228 3.04e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 37094.29)

    Null deviance: 1829873  on 23  degrees of freedom
Residual deviance:  816074  on 22  degrees of freedom
AIC: 324.53
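As a quick sanity check on the output, each t value in the coefficient table is the estimate divided by its standard error. Recomputing this in Python from the rounded figures printed above (R itself uses the unrounded internal values, so the last decimal place differs slightly):

```python
# Rounded figures for the age coefficient, copied from the summary above.
estimate = 115.5
std_error = 22.1

# t value = estimate / standard error; R prints 5.228 because it
# divides the unrounded quantities, so our rounded inputs give ~5.226.
t_value = estimate / std_error
print(round(t_value, 3))
```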
Exercise 7.53
Calculate the 95% confidence interval for the age coefficient.