R: Ecological niche model using a log relative risk surface

lrren {envi}

R Documentation

Ecological niche model using a log relative risk surface

Description

Estimate the ecological niche of a single species with presence/absence data and two covariates. Predict the ecological niche in geographic space.

Usage

lrren(
  obs_locs,
  predict = FALSE,
  predict_locs = NULL,
  conserve = TRUE,
  alpha = 0.05,
  p_correct = "none",
  cv = FALSE,
  kfold = 10,
  balance = FALSE,
  parallel = FALSE,
  n_core = 2,
  poly_buffer = NULL,
  obs_window = NULL,
  verbose = FALSE,
  ...
)

Arguments

`obs_locs`	Input data frame of presence and absence observations with six (6) features (columns): 1) ID, 2) longitude, 3) latitude, 4) presence/absence binary variable, 5) covariate 1 as x-coordinate, 6) covariate 2 as y-coordinate.
`predict`	Logical. If TRUE, will predict the ecological niche in geographic space. If FALSE (the default), will not predict.
`predict_locs`	Input data frame of prediction locations with 4 features (columns): 1) longitude, 2) latitude, 3) covariate 1 as x-coordinate, 4) covariate 2 as y-coordinate. The covariates must be the same as those included in `obs_locs`.
`conserve`	Logical. If TRUE (the default), the ecological niche will be estimated within a concave hull around the locations in `obs_locs`. If FALSE, the ecological niche will be estimated within a concave hull around the locations in `predict_locs`.
`alpha`	Numeric. The two-tailed alpha level for the significance threshold (the default is 0.05).
`p_correct`	Optional. Character string specifying whether to apply a correction for multiple comparisons including a False Discovery Rate `p_correct = "FDR"`, a Sidak correction `p_correct = "Sidak"`, and a Bonferroni correction `p_correct = "Bonferroni"`. If `p_correct = "none"` (the default), then no correction is applied.
`cv`	Logical. If TRUE, will calculate prediction diagnostics using internal k-fold cross-validation. If FALSE (the default), will not.
`kfold`	Integer. Specify the number of folds used in the internal cross-validation. The default is 10.
`balance`	Logical. If TRUE, the prevalence within each k-fold will be 0.50 by undersampling absence locations (assumes absence data are more frequent). If FALSE (the default), the prevalence within each k-fold will match the prevalence in `obs_locs`.
`parallel`	Logical. If TRUE, will execute the function in parallel. If FALSE (the default), will not execute the function in parallel.
`n_core`	Optional. Integer specifying the number of CPU cores on the current host for parallelization (the default is 2 cores).
`poly_buffer`	Optional. Specify a custom distance (in the same units as covariates) to add to the window within which the ecological niche is estimated. The default is 1/100th of the smallest range among the two covariates.
`obs_window`	Optional. Specify a custom window of class 'owin' within which to estimate the ecological niche. The default computes a concave hull around the data specified in `conserve`.
`verbose`	Logical. If TRUE (the default), will print function progress during execution. If FALSE, will not print.
`...`	Arguments passed to `risk` to select bandwidth, edge correction, and resolution.

Details

This function estimates the ecological niche of a single species (presence/absence data), or the presence of one species relative to another, using two covariates, will predict the ecological niche into a geographic area and prepare k-fold cross-validation data sets for prediction diagnostics.

The function uses the risk function to estimate the spatial relative risk function and forces risk(tolerate == TRUE) in order to calculate asymptotic p-values. The estimated ecological niche can be visualized using the plot_obs function.

If predict = TRUE, this function will predict ecological niche at every location specified with predict_locs with best performance if predict_locs are gridded locations in the same study area as the observations in obs_locs - a version of environmental interpolation. The predicted spatial distribution of the estimated ecological niche can be visualized using the plot_predict function.

If cv = TRUE, this function will prepare k-fold cross-validation data sets for prediction diagnostics. The sample size of each fold depends on the number of folds set with kfold. If balance = TRUE, the sample size of each fold will be the frequency of presence locations divided by the number of folds times two. If balance = FALSE, the sample size of each fold will be the frequency of all observed locations divided by the number of folds. The cross-validation can be performed in parallel if parallel = TRUE using the future, doFuture, doRNG, and foreach packages. Two diagnostics (area under the receiver operating characteristic curve and precision-recall curve) can be visualized using the plot_cv function.

The obs_window argument may be useful to specify a 'known' window for the ecological niche (e.g., a convex hull around observed locations).

This function has functionality for a correction for multiple testing. If p_correct = "FDR", calculates a False Discovery Rate by Benjamini and Hochberg. If p_correct = "Sidak", calculates a Sidak correction. If p_correct = "Bonferroni", calculates a Bonferroni correction. If p_correct = "none" (the default), then the function does not account for multiple testing and uses the uncorrected alpha level. See the internal pval_correct function documentation for more details.

Value

An object of class 'list'. This is a named list with the following components:

out: An object of class 'list' for the estimated ecological niche.
dat: An object of class 'data.frame', returns obs_locs that are used in the accompanying plotting functions.
p_critical: A numeric value for the critical p-value used for significance tests.

The returned out is a named list with the following components:

obs: An object of class 'rrs' for the spatial relative risk.
presence: An object of class 'ppp' for the presence locations.
absence: An object of class 'ppp' for the absence locations.
outer_poly: An object of class 'matrix' for the coordinates of the concave hull around the observation locations.
inner_poly: An object of class 'matrix' for the coordinates of the concave hull around the observation locations. Same as outer_poly.

If predict = TRUE, the returned out has additional components:

outer_poly: An object of class 'matrix' for the coordinates of the concave hull around the prediction locations.
prediction: An object of class 'matrix' for the coordinates of the concave hull around the prediction locations.

If cv = TRUE, the returned object of class 'list' has an additional named list cv with the following components:

cv_predictions_rr: A list of length kfold with values of the log relative risk surface at each point randomly selected in a cross-validation fold.
cv_labels: A list of length kfold with a binary value of presence (1) or absence (0) for each point randomly selected in a cross-validation fold.

Examples

if (interactive()) {
  set.seed(1234) # for reproducibility

# Using the 'bei' and 'bei.extra' data within {spatstat.data}

# Covariate data (centered and scaled)
  elev <- spatstat.data::bei.extra[[1]]
  grad <- spatstat.data::bei.extra[[2]]
  elev$v <- scale(elev)
  grad$v <- scale(grad)
  elev_raster <- terra::rast(elev)
  grad_raster <- terra::rast(grad)

# Presence data
  presence <- spatstat.data::bei
  spatstat.geom::marks(presence) <- data.frame("presence" = rep(1, presence$n),
                                               "lon" = presence$x,
                                               "lat" = presence$y)
  spatstat.geom::marks(presence)$elev <- elev[presence]
  spatstat.geom::marks(presence)$grad <- grad[presence]

# (Pseudo-)Absence data
  absence <- spatstat.random::rpoispp(0.008, win = elev)
  spatstat.geom::marks(absence) <- data.frame("presence" = rep(0, absence$n),
                                              "lon" = absence$x,
                                              "lat" = absence$y)
  spatstat.geom::marks(absence)$elev <- elev[absence]
  spatstat.geom::marks(absence)$grad <- grad[absence]

# Combine into readable format
  obs_locs <- spatstat.geom::superimpose(presence, absence, check = FALSE)
  obs_locs <- spatstat.geom::marks(obs_locs)
  obs_locs$id <- seq(1, nrow(obs_locs), 1)
  obs_locs <- obs_locs[ , c(6, 2, 3, 1, 4, 5)]
  
# Prediction Data
  predict_xy <- terra::crds(elev_raster)
  predict_locs <- as.data.frame(predict_xy)
  predict_locs$elev <- terra::extract(elev_raster, predict_xy)[ , 1]
  predict_locs$grad <- terra::extract(grad_raster, predict_xy)[ , 1]

# Run lrren
  test_lrren <- lrren(obs_locs = obs_locs,
                      predict_locs = predict_locs,
                      predict = TRUE,
                      cv = TRUE)
}

[Package envi version 0.1.19 Index]