hotspot_gistar {sfhotspot} | R Documentation |
Identify significant spatial clusters of points
Description
Identify hotspot and coldspot locations, that is, cells in a regular grid that contain more or fewer points than would be expected if the points were distributed randomly.
Usage
hotspot_gistar(
data,
cell_size = NULL,
grid_type = "rect",
kde = TRUE,
bandwidth = NULL,
bandwidth_adjust = 1,
grid = NULL,
weights = NULL,
nb_dist = NULL,
include_self = TRUE,
p_adjust_method = NULL,
quiet = FALSE,
...
)
Arguments
data |
|
cell_size |
|
grid_type |
|
kde |
|
bandwidth |
|
bandwidth_adjust |
single positive |
grid |
|
weights |
|
nb_dist |
The distance around a cell that contains the neighbours of
that cell, which are used in calculating the statistic. If this argument is
|
include_self |
Should points in a given cell be counted as well as
counts in neighbouring cells when calculating the values of
Gi*
(if |
p_adjust_method |
The method to be used to adjust p-values for
multiple comparisons. |
quiet |
if set to |
... |
Further arguments passed to |
Details
This function calculates the Getis-Ord Gi* (gi-star) or Gi Z-score statistic
for identifying clusters of point locations. The underlying implementation
uses the localG function to calculate the Z scores and then the p.adjustSP
function to adjust the corresponding p-values for multiple comparisons.
The function also returns counts of points in each cell and (by default but
optionally) kernel density estimates calculated using the kde function.
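To make the statistic concrete, the following base-R sketch computes Gi* Z scores by hand for counts on a small 5 x 5 grid, using binary weights for all cells within a fixed distance (the cell itself included). It illustrates the standard Getis-Ord formula and is not the code used by hotspot_gistar, which relies on localG. The helper gistar_z and the toy data are hypothetical, and p.adjust from base R is only a stand-in for p.adjustSP, which may adjust p-values differently.

```r
# Hypothetical helper (not part of sfhotspot or spdep): Gi* Z scores with
# binary distance-band weights.
gistar_z <- function(x, coords, nb_dist) {
  n <- length(x)
  # Cells are neighbours if their centres are within nb_dist. The diagonal of
  # the distance matrix is zero, so every cell is its own neighbour (Gi*).
  w <- (as.matrix(dist(coords)) <= nb_dist) * 1
  x_bar <- mean(x)
  s <- sqrt(sum(x^2) / n - x_bar^2)
  w_sum <- rowSums(w)
  numerator <- as.vector(w %*% x) - x_bar * w_sum
  denominator <- s * sqrt((n * rowSums(w^2) - w_sum^2) / (n - 1))
  numerator / denominator
}

# Toy data: cell centres on a 5 x 5 grid, with high counts in one corner
coords <- expand.grid(x = 1:5, y = 1:5)
counts <- ifelse(coords$x <= 2 & coords$y <= 2, 10, 1)

z <- gistar_z(counts, coords, nb_dist = 1.5)

# Two-sided p-values under a normal approximation (an assumption), adjusted
# with base R's p.adjust() as a stand-in for p.adjustSP()
p <- p.adjust(2 * pnorm(-abs(z)), method = "bonferroni")

# Z scores laid out as the grid: largest in the high-count corner
round(matrix(z, nrow = 5), 2)
```

Cells in the high-count corner receive positive Z scores and cells far from it receive negative ones, which is the pattern the function reports as hotspots and coldspots.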
Coverage of the output data
The grid produced by this function covers the convex hull of the input data
layer. This means the result may include Gi* or Gi values for cells that are
outside the area for which data were provided, which could be misleading. To
handle this, consider cropping the output layer to the area for which data
are available. For example, if you only have crime data for a particular
district, crop the output dataset to the district boundary using
st_intersection.
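A minimal sketch of the cropping step, assuming the sf and sfhotspot packages are installed. The object district_boundary is hypothetical: here it is faked by shrinking the bounding box of the data by 1 km, whereas in practice it would be a real boundary polygon in the same co-ordinate reference system as the output.

```r
library(sf)
library(sfhotspot)

robberies_utm <- st_transform(memphis_robberies_jan, 32615)
result <- hotspot_gistar(robberies_utm, quiet = TRUE)

# Hypothetical study area: the bounding box of the data shrunk by 1 km.
# Replace this with the actual boundary of the area the data cover.
district_boundary <- st_buffer(st_as_sfc(st_bbox(robberies_utm)), -1000)

# Keep only the parts of grid cells that fall inside the boundary
cropped <- st_intersection(result, district_boundary)
```

Note that st_intersection clips cells that straddle the boundary, so cells on the edge of the cropped layer may be smaller than the rest.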
Automatic cell-size selection
If no cell size is given, the cell size will be set so that there are 50
cells on the shorter side of the grid. If the data SF object is projected
in metres or feet, the number of cells will be adjusted upwards so that the
cell size is a multiple of 100.
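The rule above can be illustrated with a little arithmetic. This is a sketch of the described behaviour rather than the package's own code: the helper sketch_cell_size is hypothetical, and the guard for very small grids (where rounding down would give a cell size of zero) is an assumption.

```r
# Hypothetical helper illustrating the rule described above
sketch_cell_size <- function(xmin, ymin, xmax, ymax, linear_units = TRUE) {
  # Start with 50 cells along the shorter side of the bounding box
  shorter_side <- min(xmax - xmin, ymax - ymin)
  cell_size <- shorter_side / 50
  # For data projected in metres or feet, round the cell size down to a
  # multiple of 100, which increases the number of cells. Skipping this for
  # sizes below 100 is an assumption made to avoid a cell size of zero.
  if (linear_units && cell_size >= 100) {
    cell_size <- floor(cell_size / 100) * 100
  }
  cell_size
}

# A 30 km x 12.5 km extent: 12500 / 50 = 250, rounded down to 200
sketch_cell_size(0, 0, 30000, 12500)
```

With a cell size of 200 the shorter side holds 62.5 cells rather than 50, which is the upward adjustment in the number of cells that the text describes.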
Value
An sf tibble of regular grid cells with corresponding point counts, Gi* or
Gi values and (optionally) kernel density estimates for each cell. Values
greater than zero indicate more points than would be expected for randomly
distributed points and values less than zero indicate fewer points.
Critical values of Gi* and Gi are given in the manual page for localG.
The output from this function can be plotted in the same way as for other
SF objects, for which see vignette("sf5", package = "sf").
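As a sketch of how the returned values might be used, the following keeps only cells with significantly more points than expected and plots the statistic. The column names gistar and pvalue are assumptions about the output, so check names(result) in your installed version before relying on them.

```r
library(sf)
library(sfhotspot)

robberies_utm <- st_transform(memphis_robberies_jan, 32615)
result <- hotspot_gistar(robberies_utm, quiet = TRUE)

# Assumed column names: `gistar` for the Z score and `pvalue` for the
# adjusted p-value. which() drops any rows where the comparison is NA.
hotspots <- result[which(result$gistar > 0 & result$pvalue < 0.05), ]

# Plot the statistic for every cell using the standard sf plot method
plot(result["gistar"])
```

The 0.05 threshold is only a conventional choice; the critical values in the localG manual page give a more precise guide.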
References
Getis, A. & Ord, J. K. (1992). The Analysis of Spatial Association by Use of Distance Statistics. Geographical Analysis, 24(3), 189-206. doi:10.1111/j.1538-4632.1992.tb00261.x
Examples
library(sf)
library(sfhotspot)
# Transform data to UTM zone 15N so that cell_size and bandwidth can be set
# in metres
memphis_robberies_utm <- st_transform(memphis_robberies_jan, 32615)
# Automatically set grid-cell size, bandwidth and neighbour distance
hotspot_gistar(memphis_robberies_utm)
# Manually set grid-cell size in metres, since the `memphis_robberies_utm`
# dataset uses a co-ordinate reference system (UTM zone 15 north) that is
# specified in metres
hotspot_gistar(memphis_robberies_utm, cell_size = 200)
# Automatically set grid-cell size and bandwidth for lon/lat data, since it
# is not intuitive to set these values manually in decimal degrees. KDE values
# must not be calculated in this case (kde = FALSE) because of a limitation in
# the underlying function.
hotspot_gistar(memphis_robberies, kde = FALSE)