This section demonstrates how to fit the geostatistical model to the magnesium soil data example discussed in lectures. The data for this example comes from the geoR package; we will use the gstat package to analyse them. Run the following lines if necessary (try typing library(geoR) first, for instance).
install.packages("geoR", dependencies = TRUE) install.packages("gstat", dependencies = TRUE)
Then load them into R:
library(geoR)
Error in library(geoR): there is no package called ’geoR’
library(gstat)
Next we read in the data
data(camg)
Type head(camg) to look at the top few rows of the data. We can convert this object into a SpatialPointsDataFrame object (from the package sp) as follows:
coordinates(camg) = ~east + north
Error in coordinates(camg) = ~east + north: object ’camg’ not found
The SpatialPointsDataFrame is another example of an S4 class of objects, it contains both information on the locations of points as well as having a data.frame containing covariate information for each point. We will examine the magnesium concentration in the 0-20cm soil layer, mg020. You can plot the locations of the data, as well as specific attributes of the data as follows:
plot(camg)
Error in plot(camg): object ’camg’ not found
spplot(camg, "mg020")
Error in spplot(camg, "mg020"): object ’camg’ not found
Let denote the magnesium concentration at location and be a spatial Gaussian process with Exponential correlation function. We aim to fit the following two models to the :
(1.1) | |||||
(1.2) |
Model (1.1) is the simplest of the two; in this model, we model the magnesium concentration directly as a noisy realisation of the spatial Gaussian process, . In model (1.2), the inclusion of spatially referenced covariate effects , , represents the explained variation in the prediction surface, and the interpretation of in this case is the unexplained variation.
The first step is to estimate the variogram for these data:
mgvg <- variogram(mg020 ~ 1, data = camg)
Error in variogram.formula(mg020 ~ 1, data = camg): object ’camg’ not found
and plot it:
plot(mgvg)
Error in plot(mgvg): object ’mgvg’ not found
Now next find the MLEs for the parameters in the stationary Gaussian model:
mgvgfit <- fit.variogram(mgvg, model = vgm(model = "Exp", psill = 30, range = 200, nugget = 10))
Error in fit.variogram(mgvg, model = vgm(model = "Exp", psill = 30, range = 200, : object ’mgvg’ not found
mgvgfit
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
and add them to the plot
plot(mgvg, mgvgfit)
Error in plot(mgvg, mgvgfit): object ’mgvg’ not found
The help file for the vgm function above does not make clear the exact role of the arguments when they are used inside the function fit.variogram, however according to Bivand et al. (2013) these are initial values for the variogram fit. In this case, my choice of psill=30, range=200, nugget=10 was based on inspection of the empirical variogram, but the initial values don’t appear to be too sensitive in this case.
Try checking with several different initial values, has the procedure achieved convergence?
We are now in a position to be able to do the kriging, but before we begin, we’ll construct an output grid onto which we will predict the surface. This will be a regular grid covering the observation window, and this will be stored using an sp object of class SpatialPixels.
bb <- bbox(camg) # bounding box
Error in bbox(camg): object ’camg’ not found
delta <- 20
xseq <- seq(bb[1, 1], bb[1, 2], by = delta)
Error in seq(bb[1, 1], bb[1, 2], by = delta): object ’bb’ not found
yseq <- seq(bb[2, 1], bb[2, 2], by = delta)
Error in seq(bb[2, 1], bb[2, 2], by = delta): object ’bb’ not found
predictgrid <- SpatialPixels(SpatialPoints(expand.grid(xseq,
yseq)))
Error in expand.grid(xseq, yseq): object ’xseq’ not found
We perform the kriging step as follows:
mgkrige = krige(mg020 ~ 1, camg, predictgrid, model = mgvgfit)
Error in krige(mg020 ~ 1, camg, predictgrid, model = mgvgfit): object ’camg’ not found
This produces an object of class SpatialPixelsDataFrame, which contains the predictions and marginal prediction variances.
spplot(mgkrige, "var1.pred")
Error in spplot(mgkrige, "var1.pred"): object ’mgkrige’ not found
spplot(mgkrige, "var1.var")
Error in spplot(mgkrige, "var1.var"): object ’mgkrige’ not found
These predictions are marginally (in fact they are jointly) Gaussian, so we can also produce maps of cell-wise exceedances, for example showing or . Such maps are more useful to scientists than maps of the prediction surface
Produce a map illustrating .
Note that although we have been using an exponential covariance model to estimate a parametric form for the variogram, the methods illustrated above make no mention of the common parameters for this model, and . Recall from the notes that the variogram can be written as:
where is the variance parameter, is the range parameter, as the measurement error parameter or nugget effect, and as the sill. So we can get a handle on these parameters from the object mgvgfit.
mgvgfit$psill
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
mgvgfit$range
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
tau2 <- mgvgfit$psill[1]
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
sigma2 <- mgvgfit$psill[2]
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
phi <- mgvgfit$range[2]
Error in eval(expr, envir, enclos): object ’mgvgfit’ not found
And plotting just to confirm that this is indeed correct:
x <- seq(0, 500, length.out = 1000) y <- tau2 + sigma2 * (1 - exp(-x/phi))
Error in eval(expr, envir, enclos): object ’tau2’ not found
plot(mgvg$dist, mgvg$gamma, ylim = c(0, 45), xlim = c(0,
450))
Error in plot(mgvg$dist, mgvg$gamma, ylim = c(0, 45), xlim = c(0, 450)): object ’mgvg’ not found
lines(x, y, col = "red")
Error in xy.coords(x, y): ’x’ and ’y’ lengths differ
Complete the following exercises.
Plot the predicted surface using different parameter values. How does this change according to and ?
Repeat the analysis above, including elevation and region (in the camg data-set) as covariates in the model. To do this, you will need to re-estimate the variogram using the variogram function and then re-estimate the parameters of using fit.variogram.