5 Parameter Functions

Examples

Example 5.3:  Normal data, ϕ=μ.

$X_i \sim N(\mu,\sigma^2)$. We saw previously that $\hat{\mu}=\bar{x}$ and $\mathrm{se}(\hat{\mu})=\sigma/n^{1/2}$. Find the two forms of confidence interval for $\mu$.

The asymptotic normality approach gives a (1-α)×100% confidence interval for μ as follows.

We have already obtained that

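$$I_E(\theta)^{-1}=\begin{bmatrix}\dfrac{\sigma^2}{n} & 0\\[4pt] 0 & \dfrac{\sigma^2}{2n}\end{bmatrix}.$$

Here $\phi=g(\theta)=\mu$. Then $\hat{\phi}=\hat{\mu}$ and $\nabla g(\theta)^T=[1,0]$, so

$$\mathrm{Var}(\hat{\phi})=[1,0]\begin{bmatrix}\dfrac{\sigma^2}{n} & 0\\[4pt] 0 & \dfrac{\sigma^2}{2n}\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}=\frac{\sigma^2}{n}.$$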

Therefore the confidence interval is

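$$\left(\bar{x}-z_{\alpha/2}\frac{\hat{\sigma}}{\sqrt{n}},\ \bar{x}+z_{\alpha/2}\frac{\hat{\sigma}}{\sqrt{n}}\right),$$

where

$$\hat{\sigma}=\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2\right)^{1/2},$$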

is the maximum likelihood estimate of σ.
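The interval above can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `wald_interval_mean` and the data in `sample` are hypothetical and chosen purely for illustration.

```python
import math
from statistics import NormalDist

def wald_interval_mean(x, alpha=0.05):
    """Approximate (1 - alpha) interval: xbar +/- z_{alpha/2} * sigma_hat / sqrt(n)."""
    n = len(x)
    xbar = sum(x) / n
    # Maximum likelihood estimate of sigma (divisor n, not n - 1).
    sigma_hat = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / n)
    # Upper alpha/2 point of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma_hat / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical data, for illustration only.
sample = [0.3, -0.8, 1.1, 0.4, -0.2, 0.9, 0.0, -0.5]
lower, upper = wald_interval_mean(sample)
```

The interval is symmetric about $\bar{x}$ by construction, which is one way it differs from the profile-deviance interval derived next.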

Alternatively, we can derive confidence intervals using the profile deviance. The first step is to evaluate the profile log-likelihood for μ. Recall,

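$$\ell(\mu,\sigma)=-\frac{n}{2}\log(2\pi)-n\log\sigma-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2.$$

Fixing $\mu$ and maximising with respect to $\sigma$, the maximum occurs at $\hat{\sigma}_\mu$, where

$$\frac{\partial \ell(\mu,\hat{\sigma}_\mu)}{\partial\sigma}=0.$$

Thus,

$$0=-\frac{n}{\hat{\sigma}_\mu}+\frac{1}{\hat{\sigma}_\mu^{3}}\sum_{i=1}^{n}(x_i-\mu)^2,$$

and so

$$\hat{\sigma}_\mu=\left(\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2\right)^{1/2}.$$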

Hence,

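$$\begin{aligned}
D^*(\mu) &= -2n\log(\hat{\sigma}/\hat{\sigma}_\mu)+\frac{1}{\hat{\sigma}_\mu^{2}}\sum_{i=1}^{n}(x_i-\mu)^2-\frac{1}{\hat{\sigma}^{2}}\sum_{i=1}^{n}(x_i-\hat{\mu})^2\\
&= -2n\log(\hat{\sigma}/\hat{\sigma}_\mu)+\frac{n\hat{\sigma}_\mu^{2}}{\hat{\sigma}_\mu^{2}}-\frac{n\hat{\sigma}^{2}}{\hat{\sigma}^{2}}\\
&= -2n\log(\hat{\sigma}/\hat{\sigma}_\mu).
\end{aligned}$$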
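A profile-deviance interval can be found numerically from this expression. The sketch below assumes the interval is the set of $\mu$ values whose profile deviance lies below the upper-$\alpha$ point of the $\chi^2_1$ distribution, and locates the two endpoints by bisection. The function names and the data in `sample` are hypothetical.

```python
import math
from statistics import NormalDist

def profile_deviance_mean(x, mu):
    """D*(mu) = -2 n log(sigma_hat / sigma_hat_mu) for i.i.d. normal data."""
    n = len(x)
    xbar = sum(x) / n
    sigma_hat = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / n)
    sigma_hat_mu = math.sqrt(sum((xi - mu) ** 2 for xi in x) / n)
    return -2 * n * math.log(sigma_hat / sigma_hat_mu)

def profile_interval_mean(x, alpha=0.05, tol=1e-10):
    """Endpoints at which D*(mu) crosses the chi-squared(1) critical value."""
    xbar = sum(x) / len(x)
    # The chi-squared(1) upper-alpha point is the squared normal alpha/2 point.
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2

    def solve(direction):
        # D* is zero at xbar and increases as mu moves away from it,
        # so first bracket the crossing point and then bisect.
        step = 1.0
        while profile_deviance_mean(x, xbar + direction * step) < crit:
            step *= 2
        lo, hi = 0.0, step
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if profile_deviance_mean(x, xbar + direction * mid) < crit:
                lo = mid
            else:
                hi = mid
        return xbar + direction * (lo + hi) / 2

    return solve(-1), solve(+1)

# Hypothetical data, for illustration only.
sample = [0.3, -0.8, 1.1, 0.4, -0.2, 0.9, 0.0, -0.5]
lower, upper = profile_interval_mean(sample)
```

Bisection is used here only because it needs no external libraries; any one-dimensional root finder would serve equally well.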

For a simulated dataset of size $n=25$ with $\mu=0$, $\sigma=1$, the profile deviance is plotted in the left panel of Figure 5.1, leading to a 95% confidence interval of $(-0.12,0.48)$ for $\mu$.

Figure 5.1: Left: Profile deviance for $\mu$ for simulated normal data. Right: Contour plot of log-likelihood surface for simulated normal data, with the path $(\mu,\hat{\sigma}_\mu)$ of the profile likelihood for $\mu$ superimposed.

The connection between profile likelihood and likelihood is shown in the right panel of the figure, where contours of the two-dimensional log-likelihood surface are plotted together with the path $(\mu,\hat{\sigma}_\mu)$ of the profile likelihood for $\mu$.

Example 5.4:  Normal data, other ϕs.

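$X_i \sim N(\mu,\sigma^2)$. We have already obtained that

$$I_E(\theta)^{-1}=\begin{bmatrix}\dfrac{\sigma^2}{n} & 0\\[4pt] 0 & \dfrac{\sigma^2}{2n}\end{bmatrix}.$$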
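  1. If $\phi=g(\theta)=\sigma^2$, then $\hat{\phi}=\hat{\sigma}^2$ and $\nabla g(\theta)^T=[0,2\sigma]$, so

    $$\mathrm{Var}(\hat{\phi})=[0,2\sigma]\begin{bmatrix}\dfrac{\sigma^2}{n} & 0\\[4pt] 0 & \dfrac{\sigma^2}{2n}\end{bmatrix}\begin{bmatrix}0\\2\sigma\end{bmatrix}=\frac{2\sigma^4}{n}.$$

  2. If $\phi=g(\theta)=\mu+\sigma\Phi^{-1}(1-p)$, with $p$ known, then $\hat{\phi}=\hat{\mu}+\hat{\sigma}\Phi^{-1}(1-p)$ and $\nabla g(\theta)^T=[1,\Phi^{-1}(1-p)]$, so

    $$\begin{aligned}
    \mathrm{Var}(\hat{\phi}) &= [1,\Phi^{-1}(1-p)]\begin{bmatrix}\dfrac{\sigma^2}{n} & 0\\[4pt] 0 & \dfrac{\sigma^2}{2n}\end{bmatrix}\begin{bmatrix}1\\ \Phi^{-1}(1-p)\end{bmatrix}\\
    &= \frac{\sigma^2}{2n}\left(2+[\Phi^{-1}(1-p)]^2\right).
    \end{aligned}$$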

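In each case an approximate $(1-\alpha)\times 100\%$ confidence interval is obtained as

$$\hat{\phi}\pm z_{\alpha/2}\sqrt{\mathrm{Var}(\hat{\phi})},$$

with the unknown parameters in $\mathrm{Var}(\hat{\phi})$ replaced by their maximum likelihood estimates.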
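The two variance calculations in this example follow the same pattern, a quadratic form in the gradient of $g$. The sketch below evaluates that form for a two-parameter model using only the Python standard library; the helper names and the plug-in values `n`, `mu_hat`, `sigma_hat` and `p` are hypothetical.

```python
import math
from statistics import NormalDist

def delta_method_variance(grad, inv_info):
    """grad^T * inv_info * grad for a two-parameter model."""
    return sum(grad[i] * inv_info[i][j] * grad[j]
               for i in range(2) for j in range(2))

def normal_inverse_information(sigma, n):
    """Inverse expected information for theta = (mu, sigma)."""
    return [[sigma ** 2 / n, 0.0], [0.0, sigma ** 2 / (2 * n)]]

# Hypothetical plug-in values, for illustration only.
n, mu_hat, sigma_hat, p = 50, 1.0, 2.0, 0.1
inv_info = normal_inverse_information(sigma_hat, n)

# Case 1: phi = sigma^2, with gradient (0, 2 sigma).
var_sigma2 = delta_method_variance([0.0, 2 * sigma_hat], inv_info)

# Case 2: phi = mu + sigma * Phi^{-1}(1 - p), with gradient (1, Phi^{-1}(1 - p)).
q = NormalDist().inv_cdf(1 - p)
var_quantile = delta_method_variance([1.0, q], inv_info)

# Approximate 95% interval for the quantile in case 2.
z = NormalDist().inv_cdf(0.975)
phi_hat = mu_hat + sigma_hat * q
interval = (phi_hat - z * math.sqrt(var_quantile),
            phi_hat + z * math.sqrt(var_quantile))
```

With these values, `var_sigma2` should agree with $2\hat{\sigma}^4/n$ and `var_quantile` with $\hat{\sigma}^2(2+q^2)/(2n)$, matching the closed forms above.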

Example 5.5:  Gamma Distribution, ctd.

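$X_i \sim \mathrm{Gamma}(\alpha,\beta)$ and $\phi=\alpha/\beta$, the population mean. Thus, $\hat{\phi}=\hat{\alpha}/\hat{\beta}$ and $\nabla g(\theta)^T=[1/\beta,-\alpha/\beta^2]$. Hence,

$$\mathrm{Var}(\hat{\phi})\approx[1/\beta,-\alpha/\beta^2]\,I_E(\theta)^{-1}\begin{bmatrix}1/\beta\\ -\alpha/\beta^2\end{bmatrix}$$

where

$$I_E(\theta)^{-1}=\Delta^{-1}\begin{bmatrix}\dfrac{n\alpha}{\beta^2} & \dfrac{n}{\beta}\\[4pt] \dfrac{n}{\beta} & n\gamma(\alpha)\end{bmatrix},$$

and

$$\Delta=\left(\frac{n}{\beta}\right)^{2}\left(\alpha\gamma(\alpha)-1\right).$$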
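As a check, the quadratic form can be multiplied out directly. The following sketch assumes only the expressions for $I_E(\theta)^{-1}$ and $\Delta$ given above:

$$\begin{aligned}
\mathrm{Var}(\hat{\phi}) &\approx \frac{1}{\Delta}\left\{\frac{1}{\beta^2}\cdot\frac{n\alpha}{\beta^2}-2\cdot\frac{\alpha}{\beta^3}\cdot\frac{n}{\beta}+\frac{\alpha^2}{\beta^4}\cdot n\gamma(\alpha)\right\}\\
&= \frac{n\alpha\left(\alpha\gamma(\alpha)-1\right)}{\beta^4\,\Delta}\\
&= \frac{\alpha}{n\beta^2}.
\end{aligned}$$

The factor $\alpha\gamma(\alpha)-1$ cancels, so the result does not depend on $\gamma(\alpha)$. It equals $\mathrm{Var}(X_i)/n$, which is what one would expect given that the likelihood equation for $\beta$ in this model gives $\hat{\alpha}/\hat{\beta}=\bar{x}$.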

Example 5.6:  Simple Linear Regression, ctd.

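$X_i \sim N(\alpha+\beta z_i,\sigma^2)$ with (known) explanatory variables $(z_1,\ldots,z_n)$ and $\sigma=1$ also known. We obtained,

$$\ell(\theta)=-\frac{n}{2}\log(2\pi)-n\log\sigma-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\alpha-\beta z_i)^2.$$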

Now, suppose we wish to obtain a confidence interval for β based on the profile-likelihood function. We have, for fixed β and σ=1,

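$$\frac{\partial\ell}{\partial\alpha}=\sum_{i=1}^{n}(x_i-\alpha-\beta z_i)=0$$

at the maximum, and so

$$\hat{\alpha}_\beta=\bar{x}-\beta\bar{z}.$$

Recall also that

$$\hat{\alpha}=\bar{x}-\hat{\beta}\bar{z}.$$

Hence, we obtain

$$D^*(\beta)=2\left\{-\frac{1}{2}\sum_{i=1}^{n}(x_i-\hat{\alpha}-\hat{\beta}z_i)^2+\frac{1}{2}\sum_{i=1}^{n}(x_i-\hat{\alpha}_\beta-\beta z_i)^2\right\}.$$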

For the data simulated in Figure 3.7, the corresponding profile deviance for β is plotted in Figure 5.2, leading to a 95% confidence interval for β equal to (0.10,0.21).

Figure 5.2: Profile deviance for β in regression example.
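The profile deviance for $\beta$ in this example can be evaluated with a few lines of code. The sketch below assumes $\sigma=1$ as in the example and uses the usual least-squares formula for $\hat{\beta}$, which is not restated in this section. The function name and the data `x_obs`, `z_obs` are hypothetical and are not the simulated data behind Figure 5.2.

```python
def profile_deviance_slope(x, z, beta):
    """D*(beta) for x_i ~ N(alpha + beta * z_i, 1), with alpha profiled out."""
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    # Maximum likelihood (least-squares) slope and intercept.
    szz = sum((zi - zbar) ** 2 for zi in z)
    beta_hat = sum((zi - zbar) * (xi - xbar) for xi, zi in zip(x, z)) / szz
    alpha_hat = xbar - beta_hat * zbar
    # Maximiser of the log-likelihood over alpha for the fixed beta.
    alpha_hat_beta = xbar - beta * zbar
    rss_full = sum((xi - alpha_hat - beta_hat * zi) ** 2 for xi, zi in zip(x, z))
    rss_beta = sum((xi - alpha_hat_beta - beta * zi) ** 2 for xi, zi in zip(x, z))
    # 2 * { -rss_full / 2 + rss_beta / 2 }
    return rss_beta - rss_full

# Hypothetical data, for illustration only.
z_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
x_obs = [1.2, 1.9, 3.2, 3.8, 5.1]
deviance_at_zero = profile_deviance_slope(x_obs, z_obs, 0.0)
```

Evaluating the function over a grid of $\beta$ values and reading off where it crosses the $\chi^2_1$ critical value gives an interval of the kind quoted above.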

This latter example illustrates an important use of the likelihood function for model discrimination. Suppose we are interested in assessing whether there is a ‘significant’ linear relationship between the observations $x_i$ and the explanatory variables $z_i$.

If there were no such relationship then the true value of β would be zero.

Hence, a test of the linear regression model against the simpler assumption that all of the responses have a common mean is equivalent to a test that β=0.

Since 0 falls outside our 95% confidence interval for β, this gives reasonably strong evidence that β≠0; we say that the hypothesis β=0 is rejected at the 5% significance level. But could we have reached this conclusion without having to go through the entire procedure of producing the profile deviance function?

In fact, yes. All that is necessary is to evaluate the profile log-likelihood at β=0 (equivalent to fixing β=0 and maximizing the log-likelihood with respect to the other parameters), and compare this value with the maximized log-likelihood under the complete model where β is unconstrained. Doubling the difference gives the deviance, whose value can be compared with the $\chi^2_1$ distribution to check for significance.
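This shortcut can be sketched in code as follows, again assuming $\sigma=1$ and dropping the constant $-\frac{n}{2}\log(2\pi)$, which cancels in the difference. The tail probability uses the fact that a $\chi^2_1$ variable is the square of a standard normal variable. The function names and data are hypothetical.

```python
import math
from statistics import NormalDist

def deviance_beta_zero(x, z):
    """Twice the gap in maximised log-likelihood between the full model and beta = 0."""
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    # Constrained model (beta = 0): the maximising alpha is xbar.
    loglik_null = -0.5 * sum((xi - xbar) ** 2 for xi in x)
    # Full model: least-squares slope and intercept.
    szz = sum((zi - zbar) ** 2 for zi in z)
    beta_hat = sum((zi - zbar) * (xi - xbar) for xi, zi in zip(x, z)) / szz
    alpha_hat = xbar - beta_hat * zbar
    loglik_full = -0.5 * sum((xi - alpha_hat - beta_hat * zi) ** 2
                             for xi, zi in zip(x, z))
    return 2 * (loglik_full - loglik_null)

def chi2_1_tail(d):
    """P(chi-squared(1) > d) = 2 * (1 - Phi(sqrt(d)))."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(max(d, 0.0))))

# Hypothetical data, for illustration only.
z_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
x_obs = [1.2, 1.9, 3.2, 3.8, 5.1]
deviance = deviance_beta_zero(x_obs, z_obs)
p_value = chi2_1_tail(deviance)
reject_at_5_percent = p_value < 0.05
```

Only two model fits are needed here, one with β fixed at zero and one unconstrained, rather than a fit at every point on a grid of β values.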