7 Continuous Spatial Variation

7.2 Model Fitting Via Maximum Likelihood

The model in Equation 7.1 can be written in matrix form as:

$$Y \sim \mathrm{MVN}\left(Z\beta,\, \Sigma_{\sigma,\phi} + \tau^2 I\right),$$

where $Z$ is the design matrix of covariate measurements, $\beta$ is the vector of covariate effects, $\Sigma_{\sigma,\phi}$ is the covariance matrix of the spatial process $S$ at the locations $x_1, \dots, x_n$, $I$ is the $(n \times n)$ identity matrix and $\tau^2$ is the variance of the spatially uncorrelated random effects. We can write $\Sigma_{\sigma,\phi} = \sigma^2 R_\phi$, where the $(i,j)$-element of the matrix $R_\phi$ is formed by evaluating $\rho(\|x_i - x_j\|)$ and $\|\cdot\|$ denotes Euclidean distance. Putting $\nu^2 = \tau^2/\sigma^2$ and letting $M_{\nu,\phi} = R_\phi + \nu^2 I$, we can rewrite the model above as

$$Y \sim \mathrm{MVN}\left(Z\beta,\, \sigma^2 M_{\nu,\phi}\right). \qquad (7.3)$$
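To make the construction of $M_{\nu,\phi}$ concrete, here is a minimal numpy sketch. The exponential correlation $\rho(u) = \exp(-u/\phi)$ is an illustrative assumption; it should be replaced by whichever correlation function $\rho$ is used in Equation 7.1.

```python
import numpy as np
from scipy.spatial.distance import cdist

def correlation_matrix(coords, phi):
    """R_phi: (i, j)-element is rho(||x_i - x_j||), here with rho(u) = exp(-u/phi)."""
    D = cdist(coords, coords)        # pairwise Euclidean distances ||x_i - x_j||
    return np.exp(-D / phi)          # swap in the rho from Equation 7.1 as needed

def M_matrix(coords, nu2, phi):
    """M_{nu,phi} = R_phi + nu^2 I, so that Var(Y) = sigma^2 M_{nu,phi}."""
    return correlation_matrix(coords, phi) + nu2 * np.eye(len(coords))
```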

The log-likelihood for a set of data $Y$ under Equation 7.3 is then:

$$\log \pi(\text{data} \mid \beta, \sigma, \nu, \phi) = \text{const} - \frac{1}{2}\log\left|\sigma^2 M_{\nu,\phi}\right| - \frac{1}{2\sigma^2}\left(Y - Z\beta\right)^T M_{\nu,\phi}^{-1}\left(Y - Z\beta\right). \qquad (7.4)$$
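As a direct numerical counterpart, Equation 7.4 (up to the additive constant) can be evaluated as in the following sketch, where $\log|\sigma^2 M_{\nu,\phi}|$ is expanded as $n\log\sigma^2 + \log|M_{\nu,\phi}|$ for numerical stability:

```python
import numpy as np

def log_likelihood(Y, Z, beta, sigma2, M):
    """Evaluate Equation 7.4 up to the additive constant."""
    n = len(Y)
    resid = Y - Z @ beta                          # Y - Z beta
    _, logdet_M = np.linalg.slogdet(M)            # log|M|, computed stably
    quad = resid @ np.linalg.solve(M, resid)      # (Y-Zb)' M^{-1} (Y-Zb)
    return -0.5 * (n * np.log(sigma2) + logdet_M) - quad / (2.0 * sigma2)
```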

For given $\nu$ and $\phi$, we can maximise Equation 7.4 exactly using estimated generalised least squares (EGLS); this yields estimates of $\beta$ and $\sigma$, denoted respectively $\hat{\beta}^{(1)}$ and $\hat{\sigma}^{(1)}$. Now, given $\hat{\beta}^{(1)}$ and $\hat{\sigma}^{(1)}$, we can numerically maximise the profile log-likelihood to obtain estimates of $\nu$ and $\phi$, denoted $\hat{\nu}^{(1)}$ and $\hat{\phi}^{(1)}$. Next, given $\hat{\nu}^{(1)}$ and $\hat{\phi}^{(1)}$, we can again maximise exactly to obtain $\hat{\beta}^{(2)}$ and $\hat{\sigma}^{(2)}$, and again numerically maximise the resulting profile likelihood to obtain $\hat{\nu}^{(2)}$ and $\hat{\phi}^{(2)}$.

Iterating this procedure, we find the MLEs of $\beta$, $\sigma$, $\nu$ and $\phi$ and the estimated Hessian, from which we can derive confidence intervals. There are two issues worth noting; a code sketch of the full scheme follows the list.

  • As with all complex maximum likelihood routines, ideally one should attempt the maximisation process from several starting values. Note that the variogram (see below) can be used to generate initial values.

  • Since the above algorithm requires the inversion of an $n \times n$ matrix, it can be prohibitively slow for large datasets.
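With those caveats in mind, here is a minimal Python sketch of the fitting scheme described above, reusing `M_matrix` (and hence the assumed exponential correlation) from the earlier sketch. The Nelder-Mead optimiser, the starting values and the log-scale parametrisation of $(\nu^2, \phi)$ are illustrative choices, not the only ones.

```python
import numpy as np
from scipy.optimize import minimize

def egls_step(Y, Z, M):
    """Exact step: the closed forms for beta and sigma^2 derived in Examples 7.4-7.5."""
    Minv_Z = np.linalg.solve(M, Z)                        # M^{-1} Z
    beta = np.linalg.solve(Z.T @ Minv_Z, Minv_Z.T @ Y)    # (Z'M^{-1}Z)^{-1} Z'M^{-1}Y
    resid = Y - Z @ beta
    sigma2 = resid @ np.linalg.solve(M, resid) / len(Y)   # (1/n) r' M^{-1} r
    return beta, sigma2

def profile_neg_loglik(log_params, Y, Z, coords):
    """Negative profile log-likelihood in (nu^2, phi): beta, sigma^2 profiled out."""
    nu2, phi = np.exp(log_params)                         # log scale keeps both positive
    M = M_matrix(coords, nu2, phi)                        # from the earlier sketch
    _, sigma2 = egls_step(Y, Z, M)
    _, logdet_M = np.linalg.slogdet(M)
    return 0.5 * (len(Y) * np.log(sigma2) + logdet_M)     # additive constants dropped

def fit(Y, Z, coords, start=(0.1, 1.0)):
    """Maximise (7.4): beta, sigma^2 handled exactly; nu^2, phi numerically."""
    res = minimize(profile_neg_loglik, np.log(start), args=(Y, Z, coords),
                   method="Nelder-Mead")
    nu2, phi = np.exp(res.x)
    beta, sigma2 = egls_step(Y, Z, M_matrix(coords, nu2, phi))
    return beta, sigma2, nu2, phi
```

Because `profile_neg_loglik` re-runs the exact step at every candidate $(\nu, \phi)$, the alternation collapses into a single numerical optimisation over $(\nu, \phi)$; a finite-difference Hessian of the log-likelihood at the optimum can then supply the confidence intervals mentioned above.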

Regardless of which method of estimation is chosen, it is always a good idea to plot the variogram (see below) to check whether the MLEs of $\beta$, $\sigma$, $\nu$ and $\phi$ look sensible.

Example 7.4.

Estimated Generalised Least Squares (I) Here we show how to derive the MLE for $\beta$ in a model with correlated residuals. Start with $Y = X\beta + \epsilon$, where $\epsilon \sim N(0, \Sigma)$ and $\Sigma$ is a covariance matrix. Since $\Sigma$ is symmetric and positive definite, we can use the eigendecomposition to write it as $\Sigma = U\Lambda U^T$, which means $\Sigma^{-1} = U\Lambda^{-1}U^T$. We write $U\Lambda^{-1}U^T = U\Lambda^{-1/2}(\Lambda^{-1/2})^T U^T = VV^T$, where $V = U\Lambda^{-1/2}$. Pre-multiplying the original equation by $V^T$ gives $V^T Y = V^T X\beta + V^T\epsilon$, and $\mathbb{V}(V^T\epsilon) = V^T\Sigma V = I$, where $I$ is the identity matrix. Therefore, in pre-multiplying by $V^T$, we reduce our original model to one that can be solved by ordinary least squares, and we can derive the EGLS estimate of $\beta$ by solving the matrix equation $V^T Y = V^T X\beta$ for $\beta$. Pre-multiplying by $X^T V$ gives

$$
\begin{aligned}
X^T V V^T Y &= X^T V V^T X\beta \\
X^T \Sigma^{-1} Y &= X^T \Sigma^{-1} X\beta \\
\hat{\beta}_{\mathsf{EGLS}} &= \left(X^T \Sigma^{-1} X\right)^{-1} X^T \Sigma^{-1} Y
\end{aligned} \qquad (7.5)
$$
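The whitening argument translates directly into code. The following sketch (with an arbitrary simulated $\Sigma$, purely for illustration) builds $V = U\Lambda^{-1/2}$ and confirms that ordinary least squares on the transformed data reproduces Equation 7.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 2
X = rng.normal(size=(n, p))                       # design matrix
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)                   # an arbitrary SPD covariance
Y = X @ np.array([1.0, -0.5]) + rng.multivariate_normal(np.zeros(n), Sigma)

# Eigendecomposition Sigma = U Lam U^T; whitening matrix V = U Lam^{-1/2}
lam, U = np.linalg.eigh(Sigma)
V = U / np.sqrt(lam)                              # scales column j of U by lam_j^{-1/2}

# OLS on the transformed model V^T Y = V^T X beta + V^T eps
beta_ols, *_ = np.linalg.lstsq(V.T @ X, V.T @ Y, rcond=None)

# Direct evaluation of Equation 7.5
Sinv = np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(X.T @ Sinv @ X, X.T @ Sinv @ Y)

assert np.allclose(beta_ols, beta_gls)            # the two routes agree
```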
Example 7.5.

Estimated Generalised Least Squares (II) Here we show how to derive the MLEs for $\beta$ and $\sigma^2$ in Equation 7.3. Applying EGLS (i.e. Equation 7.5, with $Z$ as the design matrix) to this model yields $\hat{\beta} = \left(Z^T (\sigma^2 M_{\nu,\phi})^{-1} Z\right)^{-1} Z^T (\sigma^2 M_{\nu,\phi})^{-1} Y$; the factors of $\sigma^2$ cancel, hence

$$\hat{\beta} = \left(Z^T M_{\nu,\phi}^{-1} Z\right)^{-1} Z^T M_{\nu,\phi}^{-1} Y.$$

Now suppose that we have obtained the MLE for $\beta$, so that conditional on this parameter, $Y \sim \mathrm{MVN}(Z\hat{\beta}, \sigma^2 M_{\nu,\phi})$.

First observe that $Y - \mathbb{E}(Y) = Y - Z\hat{\beta} \sim \mathrm{MVN}(0, \sigma^2 M_{\nu,\phi})$, and so $M_{\nu,\phi}^{-1/2}(Y - Z\hat{\beta}) \sim \mathrm{MVN}(0, \sigma^2 I)$, i.e. each component of the column vector $v = M_{\nu,\phi}^{-1/2}(Y - Z\hat{\beta})$ has variance $\sigma^2$. Since $\mathbb{E}(v) = 0$, we can estimate this variance by $\frac{1}{n}v^T v$, that is,

$$\hat{\sigma}^2 = \frac{1}{n}\left(Y - Z\hat{\beta}\right)^T M_{\nu,\phi}^{-1}\left(Y - Z\hat{\beta}\right).$$
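This identity is easy to verify numerically. The following sketch simulates data on a one-dimensional grid with an assumed exponential correlation ($\phi = 5$, $\nu^2 = 0.2$, $\sigma^2 = 2$, all illustrative) and checks that $\frac{1}{n}v^T v$ matches the closed form above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
Z = np.column_stack([np.ones(n), rng.normal(size=n)])       # intercept + one covariate
D = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))   # distances on a 1-d grid
M = np.exp(-D / 5.0) + 0.2 * np.eye(n)                      # M = R_phi + nu^2 I
Y = Z @ np.array([1.0, 2.0]) + rng.multivariate_normal(np.zeros(n), 2.0 * M)

Minv = np.linalg.inv(M)
beta_hat = np.linalg.solve(Z.T @ Minv @ Z, Z.T @ Minv @ Y)  # EGLS estimate of beta
resid = Y - Z @ beta_hat

# v = W^T resid, where W = U Lam^{-1/2} satisfies W W^T = M^{-1}
lam, U = np.linalg.eigh(M)
v = (U / np.sqrt(lam)).T @ resid

assert np.isclose(v @ v / n, resid @ Minv @ resid / n)      # (1/n) v'v = sigma2_hat
```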