R: Calculate basic spatial coverage and diversity metrics

sdSumry {divvy}

R Documentation

Calculate basic spatial coverage and diversity metrics

Description

Summarise the geographic scope and position of occurrence data, and optionally estimate diversity and evenness.

Usage

sdSumry(
  dat,
  xy,
  taxVar,
  crs = "epsg:4326",
  collections = NULL,
  quotaQ = NULL,
  quotaN = NULL,
  omitDom = FALSE
)

Arguments

`dat`	A `data.frame` or `matrix` containing taxon names, coordinates, and any associated variables; or a list of such structures.
`xy`	A vector of two elements, specifying the name or numeric position of columns in `dat` containing coordinates, e.g. longitude and latitude. Coordinates for any shared sampling sites should be identical, and where sites are raster cells, coordinates are usually expected to be cell centroids.
`taxVar`	The name or numeric position of the column containing taxonomic identifications. `taxVar` must be of the same class as `xy`, e.g. a numeric column position if `xy` is given as a vector of numeric positions.
`crs`	Coordinate reference system as a GDAL text string, EPSG code, or object of class `crs`. Default is latitude-longitude (`EPSG:4326`).
`collections`	The name or numeric position of the column containing unique collection IDs, e.g. 'collection_no' in PBDB data downloads.
`quotaQ`	A numeric value for the coverage (quorum) level at which to perform coverage-based rarefaction (shareholder quorum subsampling).
`quotaN`	A numeric value for the quota of taxon occurrences to subsample in classical rarefaction.
`omitDom`	If `omitDom = TRUE` and `quotaQ` or `quotaN` is supplied, remove the most common taxon prior to rarefaction. The `nTax` and `evenness` returned are unaffected.

Details

`sdSumry()` compiles metadata about a sample or list of samples, before or after spatial subsampling. The function counts the number of collections (if requested), taxon presences (excluding repeat incidences of a taxon at a given site), and unique spatial sites; it also calculates site centroid coordinates, latitudinal range (degrees), great circle distance (km), mean pairwise distance (km), and summed minimum spanning tree length (km). Coordinates and their distances are computed with respect to the original coordinate reference system if supplied, except in the calculation of latitudinal range, for which projected coordinates are transformed to geodetic ones. If `crs` is left unspecified, points are assumed to be given in latitude-longitude and distances are calculated with spherical geometry.
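As a rough illustration of the spatial metrics listed above, the following base-R sketch computes them for a toy set of latitude-longitude points. It is not the package's internal code. The haversine formula, the Earth radius of 6371 km, the arithmetic-mean centroid, the reading of great circle distance as the maximum over site pairs, and the Prim's-algorithm spanning tree are all assumptions made for the example, and `occ` is a hypothetical data set.

```r
# Hypothetical toy data: four occurrences at three unique lon-lat sites
occ <- data.frame(
  lon = c(0, 0, 90, 0),
  lat = c(0, 0, 0, 45),
  sp  = c('a', 'b', 'a', 'c')
)
sites <- unique(occ[, c('lon', 'lat')])  # unique spatial sites
nSite <- nrow(sites)

# Assumption: centroid taken as the arithmetic mean of site coordinates
centroid <- colMeans(sites)

# Latitudinal range in degrees
latRange <- diff(range(sites$lat))

# Haversine great circle distance (km); sphere of radius 6371 km assumed
haversine <- function(lon1, lat1, lon2, lat2, r = 6371) {
  rad  <- pi / 180
  dLat <- (lat2 - lat1) * rad
  dLon <- (lon2 - lon1) * rad
  a <- sin(dLat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dLon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Pairwise distance matrix among sites
d <- outer(seq_len(nSite), seq_len(nSite), function(i, j) {
  haversine(sites$lon[i], sites$lat[i], sites$lon[j], sites$lat[j])
})
pairD    <- d[upper.tri(d)]
maxDist  <- max(pairD)   # assumption: greatest distance between any two sites
meanDist <- mean(pairD)  # mean pairwise distance

# Summed minimum spanning tree length via Prim's algorithm
inTree <- c(TRUE, rep(FALSE, nSite - 1))
mstLen <- 0
while (!all(inTree)) {
  cand   <- d[inTree, !inTree, drop = FALSE]  # edges leaving the tree
  best   <- apply(cand, 2, min)               # cheapest edge per outside site
  mstLen <- mstLen + min(best)
  inTree[which(!inTree)[which.min(best)]] <- TRUE
}
```

For projected coordinates the distances would instead be planar, in line with the note above that calculations follow the supplied coordinate reference system.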

The first two diversity variables returned are the raw count of observed taxa and the Summed Common species/taxon Occurrence Rate (SCOR). SCOR reflects the degree to which taxa are common/widespread and is decoupled from richness or abundance (Hannisdal et al. 2012). SCOR is calculated as the sum across taxa of the log probability of incidence, $\lambda$. For a given taxon, $\lambda = -\ln(1 - p)$, where $p$ is estimated as the fraction of occupied sites. Very widespread taxa make a large contribution to an assemblage SCOR, while rare taxa have relatively little influence.
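The SCOR formula can be worked through by hand in a few lines of base R. This is a minimal sketch of the calculation as described in this paragraph, not the package's own code, and the data frame `occ` is hypothetical.

```r
# Hypothetical occurrences at four sites along a transect (y is constant)
occ <- data.frame(
  x  = c(1, 2, 3, 1, 2, 4, 1),
  y  = c(1, 1, 1, 1, 1, 1, 1),
  sp = c('a', 'a', 'a', 'b', 'b', 'c', 'a')  # last row repeats 'a' at site (1, 1)
)

# Drop repeat incidences of a taxon at a given site
occ <- unique(occ)

# Number of unique sites
nSite <- nrow(unique(occ[, c('x', 'y')]))

# p: fraction of sites occupied by each taxon (a = 3/4, b = 2/4, c = 1/4)
p <- table(occ$sp) / nSite

# lambda = -ln(1 - p) per taxon; SCOR is the sum across taxa
lambda <- -log(1 - p)
scor   <- sum(lambda)
```

Here the widespread taxon 'a' contributes about 1.39 to a SCOR of about 2.37, while the rare taxon 'c' contributes about 0.29. In this naive version a taxon present at every site has p = 1 and an infinite lambda.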

If `quotaQ` is supplied, `sdSumry()` rarefies richness at the given coverage value and returns the point estimate of richness (Hill number 0) and its 95% confidence interval, as well as estimates of evenness (Pielou's J) and frequency-distribution sample coverage (given by `iNEXT$DataInfo`). If `quotaN` is supplied, `sdSumry()` rarefies richness to the given number of occurrences and returns the point estimate of richness and its 95% confidence interval. Coverage-based and classical rarefaction are both calculated with `iNEXT::estimateD()` internally. For details, such as how diversity is extrapolated if sample coverage is insufficient to achieve a specified rarefaction level, consult Chao and Jost (2012) and Hsieh et al. (2016).
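To make the quantities concrete, here is a small base-R sketch of Pielou's J and of classical rarefaction. It does not reproduce `iNEXT::estimateD()`: the rarefaction step substitutes the textbook hypergeometric expectation for a draw without replacement, gives no confidence interval, and cannot extrapolate. The `rarefy()` helper and the `freq` counts are hypothetical, and treating per-taxon occurrence counts as abundances is an assumption.

```r
# Hypothetical occurrence counts per taxon (treated here as abundances)
freq <- c(a = 4, b = 2, c = 1, d = 1)
N <- sum(freq)     # total occurrences
S <- length(freq)  # observed richness

# Pielou's J: Shannon entropy divided by its maximum, ln(S)
p <- freq / N
H <- -sum(p * log(p))
J <- H / log(S)

# Classical rarefaction: expected richness in a random draw of n
# occurrences without replacement (hypergeometric expectation)
rarefy <- function(freq, n) {
  N <- sum(freq)
  stopifnot(n <= N)  # no extrapolation in this sketch
  sum(1 - choose(N - freq, n) / choose(N, n))
}
expS <- rarefy(freq, n = 4)
```

With these counts J is 0.875 and the expected richness in a draw of 4 occurrences is 194/70, about 2.77 taxa. Rarefying to the full sample size returns the observed richness.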

Value

A matrix of spatial and optional diversity metrics. If `dat` is a list of `data.frame` objects, output rows correspond to input elements.
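The list behaviour can be checked with a short sketch, assuming the divvy package is installed (the call is skipped otherwise). The split of the toy data into two samples of 25 occurrences is hypothetical, chosen only to show that each list element yields one output row.

```r
if (requireNamespace('divvy', quietly = TRUE)) {
  set.seed(9)
  obs <- data.frame(
    x  = sample(rep(1:5, 10)),
    y  = sample(rep(1:5, 10)),
    sp = sample(letters[1:20], 50, replace = TRUE)
  )
  # Hypothetical split into two samples of 25 occurrences each
  samples <- list(obs[1:25, ], obs[26:50, ])
  res <- divvy::sdSumry(samples, xy = c('x', 'y'), taxVar = 'sp')
  nrow(res)  # one row per list element
}
```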

References

Chao A, Jost L (2012). “Coverage-based rarefaction and extrapolation: standardizing samples by completeness rather than size.” Ecology, 93(12), 2533–2547. doi:10.1890/11-1952.1.

Hannisdal B, Henderiks J, Liow LH (2012). “Long-term evolutionary and ecological responses of calcifying phytoplankton to changes in atmospheric CO2.” Global Change Biology, 18(12), 3504–3516. doi:10.1111/gcb.12007.

Hsieh TC, Ma KH, Chao A (2016). “iNEXT: an R package for rarefaction and extrapolation of species diversity (Hill numbers).” Methods in Ecology and Evolution, 7(12), 1451–1456. doi:10.1111/2041-210X.12613.

Examples

library(divvy)

# generate occurrences
set.seed(9)
x <- sample(rep(1:5, 10))
y <- sample(rep(1:5, 10))
# make some species 2x or 4x as common
abund <- c(rep(4, 5), rep(2, 5), rep(1, 10))
sp <- sample(letters[1:20], 50, replace = TRUE, prob = abund)
obs <- data.frame(x, y, sp)

# minimum sample data returned
sdSumry(obs, c('x','y'), 'sp')

# also calculate evenness and coverage-based rarefaction diversity estimates
sdSumry(obs, xy = c('x','y'), taxVar = 'sp', quotaQ = 0.7)

[Package divvy version 1.0.0 Index]