4 Information and Asymptotics


From Theorems 6 and 7, the following conclusions can be drawn:

  • The maximum likelihood estimator $\hat{\theta}$ of $\theta$ is asymptotically unbiased: $E(\hat{\theta}) \to \theta_0$.

  • Asymptotically, $\mathrm{Var}(\hat{\theta}) \approx I_E(\theta_0)^{-1}$, which, by a multi-parameter version of the Cramér-Rao theorem, is optimal for unbiased estimators.

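  • If $J = I_E(\theta_0)^{-1}$, then $\mathrm{Var}(\hat{\theta}) = J$, so $J$ is a positive definite, symmetric matrix with elements $J_{i,j} = \mathrm{Cov}(\hat{\theta}_i, \hat{\theta}_j)$; hence, $J_{i,i}$ is the variance of $\hat{\theta}_i$. The quantity $J_{i,i}^{1/2}$ is the standard error of $\hat{\theta}_i$, and $\mathrm{corr}(\hat{\theta}_i, \hat{\theta}_j) = \mathrm{Cov}(\hat{\theta}_i, \hat{\theta}_j) / \sqrt{\mathrm{Var}(\hat{\theta}_i)\,\mathrm{Var}(\hat{\theta}_j)}$.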

  • We have argued that it is natural to build confidence regions for $\theta$ from those values for which the likelihood is greatest (or, equivalently, the deviance is smallest). Thus, we should choose regions of the form

    $$C_\alpha = \{\theta \in \Omega : D(\theta) < c_\alpha\}$$

    for some value of $c_\alpha$.

    To obtain a region which is a $(1-\alpha)$ confidence region, $c_\alpha$ should be chosen so that $P\{\theta_0 \in C_\alpha\} = 1 - \alpha$.

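  • The shape of contours of equal deviance near the MLE for large sample sizes can be studied by looking at the Taylor series expansion of $D(\theta) = 2[\ell(\hat{\theta}) - \ell(\theta)]$ about $\hat{\theta}$:

    $$D(\theta) \approx (\theta - \hat{\theta})^T \left[-\nabla^2 \ell(\hat{\theta})\right] (\theta - \hat{\theta}),$$

    where $-\nabla^2 \ell(\hat{\theta})$ is the negative Hessian of the log-likelihood at the MLE; the first-order term vanishes because the score is zero there. Therefore the boundary of the confidence region is given by $\theta$ such that $D(\theta) = c_\alpha$. This is the equation of an ellipse, which is the shape we found in our contour plots of the log-likelihood function.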

  • We can now obtain a suitable value of $c_\alpha$ approximately using the fact that asymptotically $D(\theta_0) \sim \chi^2_d$: choose $c_\alpha$ such that $P\{\chi^2_d \le c_\alpha\} = 1 - \alpha$. For example, if $\alpha = 0.05$, then we obtain the following table:

    d    $c_\alpha$    drop from $\ell(\hat{\theta})$ (i.e. $c_\alpha / 2$)
    1     3.84         1.92
    2     5.99         3.00
    3     7.81         3.91
    4     9.49         4.74
    5    11.07         5.54
    6    12.59         6.30

    Consequently, confidence regions for $\theta$ correspond to contours of the likelihood surface, and the appropriate contour for a specified degree of confidence can be obtained from the corresponding $\chi^2_d$ distribution.
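To make the link between $J$, standard errors and correlations concrete, here is a minimal sketch for $d = 2$. The information matrix `info` is a made-up illustration, not a value from these notes, and `invert_2x2` is a hypothetical helper written only for this example.

```python
import math

def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists (hypothetical helper)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det <= 0:
        raise ValueError("information matrix must be positive definite")
    return [[d / det, -b / det], [-c / det, a / det]]

# Assumed (illustrative) expected information matrix I_E(theta_0).
info = [[4.0, 2.0], [2.0, 3.0]]

# J = I_E(theta_0)^{-1} is the asymptotic variance-covariance matrix.
J = invert_2x2(info)

# Standard errors are the square roots of the diagonal elements of J.
se = [math.sqrt(J[i][i]) for i in range(2)]

# Correlation: covariance divided by the product of the standard errors.
corr = J[0][1] / (se[0] * se[1])

print("J =", J)
print("standard errors =", se)
print("correlation =", corr)
```

Note that a positive off-diagonal entry in the information matrix gives a negative correlation between the two estimators here, because inverting the matrix flips the sign of the off-diagonal term.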

For example, if d=2 (a bivariate likelihood), a 95% confidence region would be

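$$
\begin{aligned}
C_\alpha &= \{\theta \in \Omega : D(\theta) < c_\alpha\} \\
&= \{\theta \in \Omega : 2[\ell(\hat{\theta}) - \ell(\theta)] < 5.99\} \\
&= \{\theta \in \Omega : \ell(\hat{\theta}) - \ell(\theta) < 3\}
\end{aligned}
$$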

i.e. the region consists of the parameter pairs whose log-likelihood is within 3 units of its value at the MLE (as shown in row 2 of the table).
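The cutoffs $c_\alpha$ can be recomputed rather than looked up. The sketch below uses only the Python standard library: it evaluates the $\chi^2_d$ distribution function for integer $d$ through the standard recurrence for the regularized incomplete gamma function, then inverts it by bisection. The function names `chi2_cdf` and `chi2_quantile` are introduced here for illustration; a statistics library would normally be used instead.

```python
import math

def chi2_cdf(x, d):
    """CDF of the chi-squared distribution with d degrees of freedom (d a positive integer)."""
    if x <= 0:
        return 0.0
    h = x / 2.0
    # Base case: regularized lower incomplete gamma P(a, h)
    # at a = 1/2 (odd d) or a = 1 (even d).
    if d % 2 == 1:
        a, p = 0.5, math.erf(math.sqrt(h))
    else:
        a, p = 1.0, 1.0 - math.exp(-h)
    # Recurrence: P(a + 1, h) = P(a, h) - h^a e^{-h} / Gamma(a + 1).
    while a < d / 2.0:
        p -= h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return p

def chi2_quantile(p, d, tol=1e-10):
    """Smallest c with P(chi2_d <= c) >= p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, d) < p:   # expand the bracket until it contains the quantile
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, d) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.05
for d in range(1, 7):
    c = chi2_quantile(1.0 - alpha, d)
    # c is the deviance cutoff; c / 2 is the allowed drop in log-likelihood.
    print(d, round(c, 2), round(c / 2.0, 2))
```

For $d = 2$ the cutoff has the closed form $c_\alpha = -2 \log \alpha$, which gives a quick check on the numerical routine.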

Hence if we draw a log-likelihood contour plot with contours representing unit steps down from the maximum, a 95% confidence region is the area enclosed by the third contour, i.e. it is covered by the first (innermost) 3 contours (see Figure 4.1).

Figure 4.1: Contour plot of a gamma likelihood with shaded region corresponding to an approximate 95% confidence region (innermost 3 contours).
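A region of this kind can be approximated numerically by evaluating the log-likelihood on a grid and keeping the points that lie within 3 units of the maximum. The sketch below does this for a gamma model with shape and rate parameters. Everything specific in it is an assumption made for illustration: the simulated data, the seed, the true parameter values, the parameterization and the grid ranges are not taken from the figure, and the grid maximum only approximates the MLE.

```python
import math
import random

def gamma_loglik(shape, rate, n, sum_x, sum_log_x):
    """Log-likelihood of an i.i.d. gamma(shape, rate) sample via its sufficient statistics."""
    return (n * shape * math.log(rate) - n * math.lgamma(shape)
            + (shape - 1.0) * sum_log_x - rate * sum_x)

# Simulated data (assumption): shape 2, rate 1.5.
# random.gammavariate takes a scale parameter, so scale = 1 / rate.
random.seed(1)
data = [random.gammavariate(2.0, 1.0 / 1.5) for _ in range(50)]
n = len(data)
sum_x = sum(data)
sum_log_x = sum(math.log(x) for x in data)

# Parameter grid (assumed ranges, step 0.05 in each direction).
shapes = [0.5 + 0.05 * i for i in range(111)]
rates = [0.2 + 0.05 * j for j in range(117)]

# Log-likelihood surface on the grid, and its maximum (grid approximation to the MLE).
surface = {(a, b): gamma_loglik(a, b, n, sum_x, sum_log_x)
           for a in shapes for b in rates}
(a_hat, b_hat), l_max = max(surface.items(), key=lambda kv: kv[1])

# Approximate 95% region for d = 2: drop in log-likelihood below 5.99 / 2.
cutoff = 5.99 / 2.0
region = [theta for theta, l in surface.items() if l_max - l < cutoff]

print("grid MLE (shape, rate):", (a_hat, b_hat))
print("shape range in region:", min(a for a, _ in region), max(a for a, _ in region))
print("rate range in region:", min(b for _, b in region), max(b for _, b in region))
```

Plotting the points in `region` would give the shaded area of a contour plot like the one described above; a finer grid, or a proper optimizer for the MLE, would sharpen the boundary.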