1 Lab Session 2 Simulate Data Comparing the clustering between cases and controls

Testing for spatial clustering

We now use our simulated data to test for complete spatial randomness. We utilise the $K$ -function to perform the test, as discussed in the lecture, by comparing the estimate from the data, $\hat{K}(s)$ , with $\pi s^{2}$ , the theoretical value under complete spatial randomness. The main function in spatstat for estimating the $K$ function is Kest.

We can estimate and plot the $K$ -function for the point pattern patt1 using the following:

k1 <- Kest(patt1)
plot(k1)

Exercise 1.1.

Read the help file ?Kest. Use the str function to look at the structure of k1.

By default Kest computes versions of the $K$ -function using three different edge-correction methods: border, isotropic and translate, details of which appear in Ripley (1988). Notice also in the plot that the theoretical Poisson $K$ -function has been plotted. We will concentrate on the estimate of the $K$ -function using the isotropic correction method. The isotropic correction method was covered in the notes, it is the method that adjusts (i.e., weights) the $K$ -function estimate using the reciprocal of proportion of circumference of circle, centred at each point contained in the observation window.

One thing thing to note at this stage is that the function Kest has automatically chosen a (sensible?) range of radii at which to evaluate $\hat{K}$ . We can extract the radii and the appropriate edge-corrected estimate of $K$ from the object k1 as follows:

r1 <- k1$r
kest1 <- k1$iso

and plot it, together with the theoretical $K$ -function for the homogeneous Poisson process:

plot(r1, kest1, type = "l")
curve(pi * x^2, col = "red", lty = "dashed", add = TRUE)

Typically, the estimated $K$ -function will not lie exactly over the line representing the theoretical $K$ -function. It is therefore useful to be able to add a confidence region to the plot as an aid to interpretation. We will do this by simulating 200 (say) realisations of a point process of the same estimated intensity on the same observation window using the same values of r. We then compute the 0.025 and 0.975 quantiles of the simulated $K$ -functions at each value of r. Lastly, we will add this information to the plot.

An unbiased estimate of the intensity of the process, $\hat{\lambda}$ , is given by the number of points divided by the area of the observation window.

lambdahat1 <- patt1$n/area.owin(patt1$window)
nsims <- 200  # this is the number of processes we will simulate

In this case, the estimate of the intensity is $\hat{\lambda}=96$ . Next we generate 200 homogeneous Poisson processes on the same window and store them in an object, kfuns1.

kfuns1 <- c()  # initialise the object
for (i in 1:nsims) {
    simdata <- rpoispp(lambda = lambdahat1, win = win1)
    ksim <- Kest(simdata, r = r1)$iso
    kfuns1 <- cbind(kfuns1, ksim)
}

The quantiles of interest are given by:

qts <- apply(kfuns1, 1, quantile, probs = c(0.025,
    0.975))

Finally, we add them to the plot using,

plot(NULL, xlab = "r", ylab = "Estimated K function",
    xlim = range(r1), ylim = range(kfuns1))
polygon(c(r1, rev(r1)), c(qts[1, ], rev(qts[2, ])),
    col = "lightgrey", border = NA)
lines(r1, kest1)
curve(pi * x^2, col = "red", lty = "dashed", add = TRUE)

The horn shape of the tolerance envelope suggests that the larger the distance $r$ , the more variable is the $K$ function. Note that there is no need to repeat the Monte Carlo calculation of the tolerance limits, as long as the size of the dataset and the distance range $s$ are not altered.

Exercise 1.2.

What are your conclusions from the plot?

We expect that $\hat{K}$ will fall inside the tolerance envelope at the majority of the cases. If we had a cluster process, we would expect that the estimated $K$ function will be mostly outside the envelope, and in particular above it, where as for an inhibitory point process it would lie below the envelope.

Exercise 1.3.

Repeat the above computations for the inhomogeneous process, patt3: produce an estimate of the $K$ -function with a confidence band for this process.