7 Continuous Spatial Variation

7.3 Model Fitting Via The Variogram

Using (7.1) as the model of interest, our ultimate goal is to predict the process Y at locations where we do not have data. In the mining industry (the origin of geostatistics), this concept has obvious utility: based on core samples extracted from a set of locations, where should we set up our mine (a costly and time-intensive operation) so it will be most productive? In environmental epidemiology, we are often concerned with estimating the burden or risk of disease: similar concepts can be applied.

Spatial prediction (or kriging) via variogram estimation (as opposed to maximum likelihood estimation) usually involves at least four stages: (i) produce initial estimates of β using ordinary least squares (OLS) regression (ii) produce estimates of τ, σ and ϕ (iii) re-estimate β using EGLS (iv) produce kriged estimates of Y. Step (iii) is optional, but if we are interested in making inference about the parameters, β, then we do really need to take account of the fact that the observations are not independent (which is assumed by OLS) before presenting the results of our analysis. Steps (ii) and (iii) can also be iterated.

One of the main ingredients for producing spatial predictions of Y is an estimate of the surface S(x) and its variability at locations x where we do not necessarily have data. In order to obtain estimates of S(x) and 𝕍[S(x)] we first estimate the second order properties of S+ε. The variogram is an exploratory tool for doing this; it is also used in longitudinal data analysis.

The variogram is defined as:

V(x,x)=12𝕍{S(x)-S(x)}

For a stationary process, this quantity (known as the semivariance) can be estimated as

V(uij)=12𝔼[(Yi-Ziβ^)-(Yj-Zjβ^)]2,

for each distance uij=xi-xj, where Zi is the covariate data for the ith individual and β^ has been computed using OLS (for example). The resulting point estimates i.e. 12[(Yi-Ziβ^)-(Yj-Zjβ^)]2, computed using all possible pairs (i,j), can be plotted against uij as a ‘variogram cloud’ (Figure 7.2, left plot), or averaged for similar values of uij to produce a ‘binned variogram’ (Figure 7.2, right plot). In this example, we have not used covariates, so Figure 7.2 shows the variogram for 12𝔼[Yi-Yj]2.

Figure 7.2: Link, Caption: Estimating the variogram from the magnesium soil content data, see Example 7.3. The left plot shows a variogram cloud, which is computed from the data: for all pairs of points (i,j), their physical distance uij (on the x-axis) is plotted against the empirical semi-variance, 12[(Yi-Ziβ^)-(Yj-Zjβ^)]2 (on the y-axis) - this forms a cloud of points. In order to compute an estimate of the variogram, we can bin the distances on the x-axis and plot these against the mean of the empirical semivariances within each bin – this leads to a binned variogram, shown on the right. The binned variogram reveals a broadly increasing relationship, as might be expected: we are working with a hypothesis that things closer together are more similar than things further apart. The next stage in such an analysis might be to postulate a model for the relationship in the variogram, and thus estimate how quickly dependency decays with respect to distance.

The variogram helps to provide plausible initial estimates of the parameters of the process, as it can be shown that

V(u)=τ2+σ2{1-ρ(u)}.

Fitting the stationary Gaussian model with exponential correlation function using ordinary least squares regression, we obtain:

σ^2=35.2,ϕ^=126.267,τ^2=8.303

σ2 is known as the variance parameter, ϕ as the range parameter,τ2 as the measurement error parameter or nugget effect, and σ2+τ2 as the sill.

Figure LABEL:mgvarianceplot.pdf below shows the shape of the fitted covariance function, i.e. σ2ρ(u) with parameters set at the ordinary least squares estimates.

Figure 7.3: Link, Caption: Showing the fitted covariance function (covariance on the y-axis plotted against distance, u on the x-axis) from the analysis of the magnesium soil content data, Example 7.3. As distance increases, the covariance decreases. Note that for interpretability, it is usually better to visualise the correlation function, rather than the covariance function, as it is scale-independent.
Example 7.6.

Exam Question 2014

Suppose that it is desired to fit the following model to a set of geostatistical data

Yi = Xiβ+S(xi)+Zi,
Zi N(0,τ2),

where {Y1,,Yn} are measurements at locations {x1,,xn}; Xi is a vector of covariate information at each xi; and S(xi) is the value of a zero-mean second order stationary spatial Gaussian process at xi. Suppose S has an exponential correlation function with variance parameter σ2 and spatial decay parameter ϕ. Figure 7.4 shows the variogram used to produce the estimates σ2, ϕ and τ2 used to create Σ. Use this plot to suggest estimates for σ2, ϕ and τ2 stating any results you use.

Figure 7.4: Link, Caption: Plot of fitted variogram for Example 7.6. This plot illustrates an exponential model for the variogram (blue line) with points from the raw binned variogram as blue dots; the numbers are not relevant for this example.

Solution:

The variogram helps to provide plausible initial estimates of the parameters of the process S as it can be shown that V(u)=τ2+σ2{1-ρ(u)}, where ρ is the correlation function. Since ρ(u) tends to zero as u, the sill is an estimate of τ2+σ2 and since ρ(0)=1, the intercept is an estimate of τ2, hence from the plot σ2=68118-11915=56203, τ2=11915 (the intercept). For ϕ choose a point on the line and note the value of u and V(u), rearranging the above, we get ϕ=-u/{log[1-(V(u)-τ2)/σ2]} with u=500 and V(u)=56000, this gives ϕ=326 (the exact answer is 293).