scan_pb_poisson {scanstatistics} | R Documentation |
Calculate the population-based Poisson scan statistic.
Description
Calculate the population-based Poisson scan statistic devised by Kulldorff (1997, 2001).
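As a rough illustration of the kind of score behind this statistic, the base R sketch below evaluates a Kulldorff-style Poisson log-likelihood ratio for a single candidate zone, with the expected count taken proportional to the zone's share of the population. The helper name kulldorff_llr and the toy numbers are hypothetical, and this is an assumption-laden sketch rather than the package's internal implementation, which also scans over durations.

```r
# Hypothetical helper (not part of scanstatistics): Poisson log-likelihood
# ratio for one zone, given the observed count inside the zone (c_in), the
# expected count inside the zone (mu_in), and the total count (c_total).
kulldorff_llr <- function(c_in, mu_in, c_total) {
  c_out  <- c_total - c_in
  mu_out <- c_total - mu_in
  # Only an excess of cases inside the zone counts as evidence of a cluster.
  if (c_in / mu_in <= c_out / mu_out) return(0)
  term_in  <- c_in * log(c_in / mu_in)
  term_out <- if (c_out > 0) c_out * log(c_out / mu_out) else 0
  term_in + term_out
}

# Toy data for four locations (made-up values).
counts <- c(4, 6, 15, 5)
pop    <- c(100, 120, 110, 90)
zone   <- c(3, 4)                      # candidate zone: locations 3 and 4

c_total <- sum(counts)
c_in    <- sum(counts[zone])
mu_in   <- c_total * sum(pop[zone]) / sum(pop)  # population-based expectation

score       <- kulldorff_llr(c_in, mu_in, c_total)
relrisk_in  <- c_in / mu_in
relrisk_out <- (c_total - c_in) / (c_total - mu_in)
```

Under these assumptions, the scan statistic would be the largest such score over all zones considered.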
Usage
scan_pb_poisson(
  counts,
  zones,
  population = NULL,
  n_mcsim = 0,
  gumbel = FALSE,
  max_only = FALSE
)
Arguments
counts |
Either:
|
zones |
A list of integer vectors. Each vector corresponds to a single zone; its elements are the numbers of the locations in that zone. |
population |
Optional. A matrix or vector of populations for each
location and time point. Only needed if |
n_mcsim |
A non-negative integer; the number of replicate scan statistics to generate in order to calculate a P-value. |
gumbel |
Logical: should a Gumbel P-value be calculated? Default is FALSE. |
max_only |
Boolean. If |
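To illustrate the shapes of the arguments above, the following sketch builds a small input by hand. It mirrors the layout used in the Examples (rows are time points, columns are locations); all values are made up, and the checks are only suggestions, not validation performed by the package.

```r
# Toy data: 3 time points (rows) by 4 locations (columns); values are made up.
counts <- matrix(c(2L, 0L, 1L, 3L,
                   1L, 1L, 0L, 4L,
                   0L, 2L, 1L, 5L),
                 nrow = 3, ncol = 4, byrow = TRUE)
population <- matrix(100, nrow = 3, ncol = 4)

# Each list element is one zone, given as the numbers (column indices)
# of the locations it contains.
zones <- list(1L, 2L, 3L, 4L, c(1L, 2L), c(3L, 4L), c(2L, 3L, 4L))

# Sanity checks on the inputs before scanning.
stopifnot(
  all(vapply(zones, is.integer, logical(1))),
  all(unlist(zones) %in% seq_len(ncol(counts))),
  identical(dim(counts), dim(population))
)

# Total count per zone over all time points, as a quick look at the data.
zone_counts <- vapply(zones, function(z) sum(counts[, z]), integer(1))
```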
Value
A list which, in addition to the information about the type of scan statistic, has the following components:
- MLC
A list containing the number of the zone of the most likely cluster (MLC), the locations in that zone, the duration of the MLC, the calculated score, and the relative risk inside and outside the cluster. In order, the elements of this list are named zone_number, locations, duration, score, relrisk_in, relrisk_out.
- observed
A data frame containing, for each combination of zone and duration investigated, the zone number, duration, score, and relative risks. The table is sorted by score, with the top-scoring entry first. If max_only = TRUE, it contains only a single row corresponding to the MLC.
- replicates
A data frame of the Monte Carlo replicates of the scan statistic (if any), and the corresponding zones and durations.
- MC_pvalue
The Monte Carlo P-value.
- Gumbel_pvalue
A P-value obtained by fitting a Gumbel distribution to the replicate scan statistics.
- n_zones
The number of zones scanned.
- n_locations
The number of locations.
- max_duration
The maximum duration considered.
- n_mcsim
The number of Monte Carlo replicates made.
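The two P-values listed above can be pictured with the base R sketch below. It uses a common rank-based convention for the Monte Carlo P-value and a method-of-moments Gumbel fit; both are assumptions made for illustration, and the package may compute or fit these differently. The replicate values and the observed score are simulated stand-ins.

```r
set.seed(42)

# Stand-ins: 99 replicate scan statistics and one observed statistic.
replicates <- rexp(99)
observed   <- 6.5

# Monte Carlo P-value: share of statistics (replicates plus the observed one)
# that are at least as large as the observed one.
mc_pvalue <- (1 + sum(replicates >= observed)) / (1 + length(replicates))

# Gumbel P-value: fit location and scale by the method of moments, then
# evaluate the upper tail of the fitted distribution at the observed value.
euler_gamma <- 0.5772156649
scale_hat   <- sd(replicates) * sqrt(6) / pi
loc_hat     <- mean(replicates) - euler_gamma * scale_hat
gumbel_pvalue <- 1 - exp(-exp(-(observed - loc_hat) / scale_hat))
```

With this convention the Monte Carlo P-value can never fall below 1 / (n_mcsim + 1), whereas a fitted Gumbel tail can yield smaller values from the same number of replicates.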
References
Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics - Theory and Methods, 26, 1481–1496.
Kulldorff, M. (2001). Prospective time periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society, Series A (Statistics in Society), 164, 61–72.
Examples
set.seed(1)
# Create location coordinates, calculate nearest neighbors, and create zones
n_locs <- 50
max_duration <- 5
n_total <- n_locs * max_duration
geo <- matrix(rnorm(n_locs * 2), n_locs, 2)
knn_mat <- coords_to_knn(geo, 15)
zones <- knn_zones(knn_mat)
# Simulate data
population <- matrix(rnorm(n_total, 100, 10), max_duration, n_locs)
counts <- matrix(rpois(n_total, as.vector(population) / 20),
                 max_duration, n_locs)
# Inject outbreak/event/anomaly
ob_dur <- 3
ob_cols <- zones[[10]]
ob_rows <- max_duration + 1 - seq_len(ob_dur)
counts[ob_rows, ob_cols] <- matrix(
  rpois(ob_dur * length(ob_cols), 2 * population[ob_rows, ob_cols] / 20),
  length(ob_rows), length(ob_cols))
res <- scan_pb_poisson(counts = counts,
                       zones = zones,
                       population = population,
                       n_mcsim = 99,
                       max_only = FALSE)