R: Clustering Function for Kulldorff and Nagarwalla's Statistic

kn.iscluster {DCluster}

R Documentation

Clustering Function for Kulldorff and Nagarwalla's Statistic

Description

kn.iscluster is called from opgam when scanning the whole study area. At every point of the grid (which may be the set of all area centroids), it computes Kulldorff and Nagarwalla's statistic to decide whether that point is the centre of a cluster.
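As a rough illustration of the kind of statistic involved, the sketch below computes the log likelihood ratio of a single candidate zone in the Poisson form commonly used for spatial scan statistics. The helper kn_log_lr and its argument names are hypothetical and are not part of DCluster; whether kullnagar.stat computes exactly this quantity is an assumption.

```r
# Hypothetical helper (not part of DCluster): log likelihood ratio for one
# candidate zone under a Poisson model.
#   obs_in, exp_in : observed and expected cases inside the zone
#   obs_tot        : total observed cases (expected cases are assumed to be
#                    scaled so that they also sum to obs_tot)
kn_log_lr <- function(obs_in, exp_in, obs_tot) {
  obs_out <- obs_tot - obs_in
  exp_out <- obs_tot - exp_in
  # Only zones with more cases than expected count as potential clusters
  if (obs_in <= exp_in) return(0)
  # o * log(o / e), with the convention 0 * log(0) = 0
  term <- function(o, e) if (o == 0) 0 else o * log(o / e)
  term(obs_in, exp_in) + term(obs_out, exp_out)
}

# Toy numbers: 10 cases observed where 5 were expected, out of 100 in total
kn_log_lr(obs_in = 10, exp_in = 5, obs_tot = 100)
```

In a scan, a value like this would be evaluated for each admissible ball around the current grid point and the largest one retained.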

See opgam.iscluster.default for more details. kn.gumbel.iscluster uses a Gumbel distribution to compute the p-values for each possible cluster.
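The idea behind a Gumbel-based p-value can be sketched as follows: fit a Gumbel distribution to the statistics obtained from the simulated data sets, then read the p-value off its upper tail. The helper gumbel_pvalue is hypothetical, uses a method-of-moments fit, and runs on made-up numbers; it is not the code used by kn.gumbel.iscluster.

```r
# Hypothetical helper (not part of DCluster): upper-tail p-value from a Gumbel
# distribution fitted by the method of moments to simulated statistics.
gumbel_pvalue <- function(stat_obs, stat_sim) {
  euler_gamma <- 0.5772156649                 # Euler-Mascheroni constant
  beta <- sd(stat_sim) * sqrt(6) / pi         # scale estimate
  mu <- mean(stat_sim) - euler_gamma * beta   # location estimate
  # P(X > stat_obs) for X ~ Gumbel(mu, beta)
  1 - exp(-exp(-(stat_obs - mu) / beta))
}

set.seed(1)
# Stand-in for the maximum statistic from each of R = 100 simulated data sets
stat_sim <- replicate(100, max(rexp(50)))
gumbel_pvalue(stat_obs = 8, stat_sim = stat_sim)
```

Unlike a rank-based Monte Carlo p-value, which cannot fall below 1/(R+1), a fitted tail can return smaller p-values from the same number of replicates.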

Usage

kn.iscluster(data, idx, idxorder, alpha, fractpop, use.poisson=TRUE,
 model="poisson", R, mle, ...)
kn.gumbel.iscluster(data, idx, idxorder, alpha, fractpop, use.poisson=TRUE,
 model="poisson", R, mle)

Arguments

`data`	A data frame with the data, as explained in DCluster.
`idx`	A logical vector indicating which areas lie in the current circle.
`idxorder`	A permutation of the rows of data to order the regions according to their distance to the current center.
`alpha`	Test significance level.
`fractpop`	Maximum fraction of the total population used when creating the balls.
`use.poisson`	Use the statistic for the Poisson (default) or the Bernoulli case.
`model`	The model used to generate random observations. It can be 'permutation', 'multinomial', 'poisson' or 'negbin'. See the observed.sim manual page for details.
`R`	The number of bootstrap replicates to generate.
`mle`	Parameters needed by the bootstrap procedure.
`...`	Extra arguments to be passed to kullnagar.stat().
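
To make idx, idxorder and fractpop more concrete, here is a minimal sketch of how such values could be derived for one grid point. The toy data frame, the centre and the way idx is built from the population cap are all assumptions made for illustration; opgam prepares these arguments itself and its internals may differ.

```r
# Toy data: five regions with coordinates and population (all values made up)
toy <- data.frame(x = c(0, 1, 2, 3, 4), y = c(0, 0, 0, 0, 0),
                  Population = c(10, 20, 30, 25, 15))
centre <- c(x = 1, y = 0)   # current grid point (hypothetical)
fractpop <- 0.5

# Order the regions by their distance to the centre, nearest first
d <- sqrt((toy$x - centre[["x"]])^2 + (toy$y - centre[["y"]])^2)
idxorder <- order(d)

# Keep the nearest regions while their cumulative population stays within
# the allowed fraction of the total population
cumpop <- cumsum(toy$Population[idxorder])
inball <- cumpop <= fractpop * sum(toy$Population)

# Logical vector, in the original row order, marking regions inside the ball
idx <- logical(nrow(toy))
idx[idxorder[inball]] <- TRUE

idxorder     # rows of toy, nearest first
which(idx)   # rows inside the largest admissible ball
```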

Value

A vector of four elements, as described in the iscluster manual page.

References

Kulldorff, Martin and Nagarwalla, Neville (1995). Spatial Disease Clusters: Detection and Inference. Statistics in Medicine 14, 799-810.

Abrams A, Kleinman K, Kulldorff M (2010). Gumbel based p-value approximations for spatial scan statistics. International Journal of Health Geographics, 9:61.

Examples

library(DCluster)
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, Population=nc.sids$BIR74, x=nc.sids$x, y=nc.sids$y)

#K&N's method over the centroids
mle<-calculate.mle(sids, model="poisson")
knresults<-opgam(data=sids, thegrid=sids[,c("x","y")], alpha=.05, 
  iscluster=kn.iscluster, fractpop=.5, R=100, model="poisson", mle=mle)

#Same analysis using Gumbel-based p-values
kngumbelres<-opgam(data=sids, thegrid=sids[,c("x","y")], alpha=.05, 
  iscluster=kn.gumbel.iscluster, fractpop=.5, R=100, model="poisson", 
  mle=mle)

#Plot all centroids, with significant ones in red (Monte Carlo p-values)
#and in blue (Gumbel-based p-values)
plot(sids$x, sids$y, main="Kulldorff and Nagarwalla's method")
points(knresults$x, knresults$y, col="red", pch=19)
points(kngumbelres$x, kngumbelres$y, col="blue", pch=20)

[Package DCluster version 0.2-10 Index]