most epidemiological case-maps show apparent clustering because cases occur most often in areas of high population
if this is the only source of spatial clustering, the same should be true of the control-map
more formally:
under the null hypothesis of no spatial clustering, the cases and the controls are independent random samples from the same underlying population at risk
under this hypothesis, $K_{11}(s) = K_{22}(s) = K_{12}(s)$
Hence, consider $D(s) = K_{11}(s) - K_{22}(s)$
Diggle and Chetwynd (1991) propose a test of spatial clustering using the statistic
$$D = \int_0^{s_0} \hat{D}(s) / \sqrt{v(s)} \, ds$$
where $v(s)$ is the variance of $\hat{D}(s)$ under random permutation of the case-control labels. Significance is assessed either by a Normal approximation or, for an exact Monte Carlo test, by simulation from the randomisation distribution under the null. Thus, whilst the statistic is motivated by the theory of stationary point processes, the inference is design-based.
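The random-labelling test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the K-functions are estimated naively with no edge correction, the integral is replaced by a sum over an equally spaced grid of distances, and $v(s)$ is estimated from the simulated relabellings rather than from the analytical permutation variance. The names `d_hat`, `diggle_chetwynd_mc`, `s_grid` and `area` are hypothetical.

```python
import math
import random
from statistics import variance


def d_hat(dist, cases, controls, s_grid, area=1.0):
    """Naive estimate of D(s) = K11(s) - K22(s) on a grid (no edge correction)."""
    def k_hat(idx):
        n = len(idx)
        pair_d = [dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:]]
        # Ordered pairs within distance s, scaled by area / (n (n - 1)).
        return [area * 2 * sum(d <= s for d in pair_d) / (n * (n - 1))
                for s in s_grid]

    return [a - b for a, b in zip(k_hat(cases), k_hat(controls))]


def diggle_chetwynd_mc(xy, is_case, s_grid, n_sim=99, area=1.0, seed=0):
    """Monte Carlo test of clustering by random relabelling of cases and controls."""
    rng = random.Random(seed)
    n = len(xy)
    dist = [[math.dist(p, q) for q in xy] for p in xy]
    labels = list(is_case)  # copy, so the caller's labels are not modified
    curves = []
    for sim in range(n_sim + 1):
        if sim > 0:
            rng.shuffle(labels)  # curve 0 uses the observed labels
        cases = [i for i in range(n) if labels[i]]
        controls = [i for i in range(n) if not labels[i]]
        curves.append(d_hat(dist, cases, controls, s_grid, area))
    # Assumption: v(s_k) is estimated empirically from all the curves.
    v = [variance(col) for col in zip(*curves)]
    # Sum over the grid; proportional to the integral for equal spacing.
    stats = [sum(d / math.sqrt(vk) for d, vk in zip(curve, v) if vk > 0)
             for curve in curves]
    # The observed value counts itself, giving (1 + #{sim >= obs}) / (n_sim + 1).
    p_value = sum(t >= stats[0] for t in stats) / (n_sim + 1)
    return stats[0], p_value
```

With `n_sim=99` the attainable p-values are multiples of 0.01, so a reported value of 0.14 would correspond to the observed statistic ranking 14th largest among the 100 values.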
There is an extensive literature on alternative tests for spatial clustering. The Diggle-Chetwynd test does not claim to be the “best” in any sense. For a discussion of other approaches, see Cuzick and Edwards (1990).
Childhood leukaemia in Humberside, from Cuzick and Edwards (1990).
A Monte Carlo test using the test statistic $D$ with 99 simulated random labellings gave a $p$-value of 0.14.
The Normal approximation gave a standard Normal deviate corresponding to a one-sided $p$-value of 0.11.
Chetwynd et al. (2001) consider the adaptation of the above method to individually matched case-control data.
for a test of clustering, a Monte Carlo test based on $D$ is still available, comparing the observed value of $D$ with simulated values under random re-labellings within matched case-control sets
for estimation, modifications are necessary because:
the randomisation variance of $\hat{D}(s)$ changes
more fundamentally, in a $1$-to-$k$ matched case-control study, $D(s) \neq 0$ under the null hypothesis of no spatial clustering.
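The restricted randomisation for matched data can be illustrated as follows. This is a hedged sketch, not the procedure as coded by Chetwynd et al. (2001): it assumes each matched set is a list of locations with the observed case in position 0, and it treats a random re-labelling as choosing one member of each set uniformly at random to carry the case label. The function names and the toy statistic `close_case_pairs` (a stand-in for $D$) are hypothetical.

```python
import math
import random


def relabel_within_sets(set_sizes, rng):
    """For each matched set, pick the position that carries the case label."""
    return [rng.randrange(size) for size in set_sizes]


def matched_mc_p_value(matched_sets, statistic, n_sim=99, seed=0):
    """Monte Carlo p-value under re-labelling within matched sets.

    matched_sets: list of lists of (x, y); position 0 is the observed case.
    statistic(cases, controls) -> float, larger meaning more clustering.
    """
    rng = random.Random(seed)
    sizes = [len(s) for s in matched_sets]

    def split(case_pos):
        cases = [s[p] for s, p in zip(matched_sets, case_pos)]
        controls = [pt for s, p in zip(matched_sets, case_pos)
                    for j, pt in enumerate(s) if j != p]
        return cases, controls

    observed = statistic(*split([0] * len(matched_sets)))
    sims = [statistic(*split(relabel_within_sets(sizes, rng)))
            for _ in range(n_sim)]
    p_value = (1 + sum(t >= observed for t in sims)) / (n_sim + 1)
    return observed, p_value


def close_case_pairs(cases, controls, s=0.1):
    """Toy stand-in for D: number of case pairs within distance s."""
    return sum(math.dist(a, b) <= s
               for i, a in enumerate(cases) for b in cases[i + 1:])
```

Because labels move only within a matched set, every simulated labelling keeps exactly one case per set, which is what distinguishes this from the unrestricted permutation used for unmatched data.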
Childhood diabetes in Yorkshire, England
matched case-control study
two controls per case, matched by age, sex and FHSA (Family Health Services Authority)
Note that one of the matching variables is, by definition, spatially structured, hence the matched design masks some of the underlying spatial variation. Using a random sample of controls within each FHSA may have been preferable.