4 Information and Asymptotics

Discussion

From Theorems 6 and 7, the following conclusions can be drawn:

  • The maximum likelihood estimator $\hat\theta$ of $\theta$ is asymptotically unbiased: $E(\hat\theta) \to \theta_0$.

  • Asymptotically, $\mathrm{Var}(\hat\theta) \approx I_E(\theta_0)^{-1}$, which, by a multi-parameter version of the Cramér-Rao theorem, is the lower bound on the variance of unbiased estimators, so the MLE is asymptotically optimal.

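  • If $J = I_E(\theta_0)^{-1}$, then $\mathrm{Var}(\hat\theta) = J$, so $J$ is a positive definite, symmetric matrix with elements $J_{i,j} = \mathrm{Cov}(\hat\theta_i, \hat\theta_j)$; hence, $J_{i,i}$ is the variance of $\hat\theta_i$. The quantity $J_{i,i}^{1/2}$ is the standard error of $\hat\theta_i$, and $\mathrm{Corr}(\hat\theta_i, \hat\theta_j) = \mathrm{Cov}(\hat\theta_i, \hat\theta_j) / \sqrt{\mathrm{Var}(\hat\theta_i)\,\mathrm{Var}(\hat\theta_j)}$.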

  • We have argued that it is natural to construct confidence regions for $\theta$ from those values for which the likelihood is greatest (or, equivalently, the deviance is smallest). Thus, we should choose regions of the form

    $$C_\alpha = \{\theta \in \Omega : D(\theta) < c_\alpha\}$$

    for some value of $c_\alpha$.

    To obtain a region which is a $(1-\alpha)$ confidence region, $c_\alpha$ should be chosen so that $P\{\theta_0 \in C_\alpha\} = 1-\alpha$.

  • The shape of contours of equal deviance near the MLE for large sample sizes can be studied by looking at the Taylor series expansion:

    $$D(\theta) \approx (\theta - \hat\theta)^T I_E(\hat\theta) (\theta - \hat\theta).$$

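    The boundary of the confidence region is therefore the set of $\theta$ such that $D(\theta) = c_\alpha$. Under the quadratic approximation this is the equation of an ellipse centred at $\hat\theta$ (an ellipsoid when $d > 2$), which is the shape we found in our contour plots of the log-likelihood function.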

  • We can now obtain a suitable value of $c_\alpha$ approximately, using the fact that asymptotically $D(\theta_0) \sim \chi^2_d$: choose $c_\alpha$ such that $P\{\chi^2_d \le c_\alpha\} = 1-\alpha$. For example, if $\alpha = 0.05$, then we obtain the following table:

    $d$   $c_\alpha$   drop from $\ell(\hat\theta)$ ($= c_\alpha/2$)
    1     3.84         1.92
    2     5.99         3.00
    3     7.81         3.90
    4     9.49         4.75
    5     11.07        5.53
    6     12.59        6.30

    Consequently, confidence regions for $\theta$ correspond to contours of the likelihood surface, and the appropriate contour for a specified degree of confidence can be obtained from the corresponding $\chi^2_d$ distribution.
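Returning to the point about $J$ above, the step from an information matrix to standard errors and a correlation can be sketched in a few lines of Python. This is only an illustration: the 2x2 matrix `info` is a hypothetical stand-in for $I_E(\theta_0)$, and the helper `invert_2x2` is introduced here for the example rather than taken from the notes.

```python
import math

# Hypothetical 2x2 expected information matrix I_E(theta_0).
# The numbers are illustrative only and do not come from any real model.
info = [[50.0, 10.0],
        [10.0, 8.0]]


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("information matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]


# J = I_E(theta_0)^{-1} is the asymptotic variance-covariance matrix.
J = invert_2x2(info)

# Standard errors are the square roots of the diagonal elements of J.
std_errors = [math.sqrt(J[i][i]) for i in range(2)]

# Correlation = covariance divided by the product of the standard errors.
corr = J[0][1] / (std_errors[0] * std_errors[1])

print("standard errors:", std_errors)
print("correlation:", corr)
```

For more than two parameters the same recipe applies with a general matrix inverse in place of `invert_2x2`.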

For example, if d=2 (a bivariate likelihood), a 95% confidence region would be

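$$
\begin{aligned}
C_\alpha &= \{\theta \in \Omega : D(\theta) < c_\alpha\} \\
&= \{\theta \in \Omega : 2[\ell(\hat\theta) - \ell(\theta)] < 5.99\} \\
&= \{\theta \in \Omega : \ell(\hat\theta) - \ell(\theta) < 3\}
\end{aligned}
$$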

i.e. the region consists of those parameter pairs whose log-likelihood lies within 3 units of its value at the MLE (as shown in row 2 of the table).
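The thresholds $c_\alpha$, including the value 5.99 used here, can be reproduced numerically. The sketch below uses only the Python standard library: `chi2_cdf` and `chi2_quantile` are helper names introduced for this illustration, the CDF uses the standard recurrence for the regularised incomplete gamma function at integer degrees of freedom, and the quantile is found by bisection. In practice a statistics library would normally be used instead (for instance `scipy.stats.chi2.ppf`).

```python
import math


def chi2_cdf(x, d):
    """P(chi-square with d degrees of freedom <= x), for a positive integer d."""
    if x <= 0:
        return 0.0
    y = x / 2.0
    # Start from the known closed forms for d = 1 (a = 1/2) or d = 2 (a = 1).
    if d % 2 == 1:
        a, p = 0.5, math.erf(math.sqrt(y))
    else:
        a, p = 1.0, 1.0 - math.exp(-y)
    # Recurrence: P(a + 1, y) = P(a, y) - y^a e^{-y} / Gamma(a + 1).
    while a < d / 2.0:
        p -= y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return p


def chi2_quantile(prob, d, tol=1e-10):
    """Smallest c (up to tol) with P(chi-square_d <= c) >= prob, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, d) < prob:  # expand the bracket until it contains c
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, d) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


alpha = 0.05
print("d  c_alpha  drop")
for d in range(1, 7):
    c = chi2_quantile(1.0 - alpha, d)
    # The drop in log-likelihood from the maximum is half the deviance threshold.
    print(f"{d}  {c:7.2f}  {c / 2.0:5.2f}")
```

Because the drop is computed here from the unrounded $c_\alpha$, its last digit can differ slightly from a value obtained by halving the rounded table entry.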

Hence, if we draw a contour plot of the log-likelihood with contours representing unit steps down from the maximum, an approximate 95% confidence region is bounded by the third contour, i.e. it comprises the first (innermost) 3 contours (see Figure 4.1).

Figure 4.1: Contour plot of a gamma likelihood with shaded region corresponding to an approximate 95% confidence region (innermost 3 contours).
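To make the membership rule for such a region concrete, the following sketch checks whether a given parameter pair lies inside the approximate 95% region for a gamma likelihood. Everything specific here is an assumption made for illustration: the data values are hypothetical, the gamma distribution is parametrised by shape and rate, and the MLE is found by a simple ternary search on the profile log-likelihood (which is concave in the shape) rather than by whatever method produced the figure.

```python
import math

# Hypothetical positive observations, used only for illustration.
data = [1.2, 0.7, 2.5, 3.1, 0.9, 1.8, 2.2, 0.4, 1.5, 2.9, 1.1, 3.6]

n = len(data)
sum_x = sum(data)
sum_log_x = sum(math.log(x) for x in data)


def log_lik(shape, rate):
    """Gamma log-likelihood in the (shape, rate) parametrisation."""
    return (n * shape * math.log(rate) - n * math.lgamma(shape)
            + (shape - 1.0) * sum_log_x - rate * sum_x)


def profile_log_lik(shape):
    """Log-likelihood maximised over the rate, which gives rate = shape / mean."""
    return log_lik(shape, shape * n / sum_x)


def fit_mle(lo=1e-3, hi=1e3, iters=200):
    """Ternary search for the shape maximising the concave profile log-likelihood."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profile_log_lik(m1) < profile_log_lik(m2):
            lo = m1
        else:
            hi = m2
    shape = (lo + hi) / 2.0
    return shape, shape * n / sum_x


shape_hat, rate_hat = fit_mle()
max_log_lik = log_lik(shape_hat, rate_hat)


def deviance(shape, rate):
    """D(theta) = 2 [l(theta_hat) - l(theta)]."""
    return 2.0 * (max_log_lik - log_lik(shape, rate))


def in_region(shape, rate, c_alpha=5.99):
    """True if (shape, rate) lies in the approximate 95% region for d = 2."""
    return deviance(shape, rate) < c_alpha


print("MLE (shape, rate):", shape_hat, rate_hat)
print("MLE inside region:", in_region(shape_hat, rate_hat))
print("distant point inside region:", in_region(5.0 * shape_hat, rate_hat))
```

Evaluating `deviance` on a grid of (shape, rate) pairs and drawing the level set at 5.99 would trace out the boundary of a region like the shaded one described in the caption.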