| lrren {envi} | R Documentation |
Ecological niche model using a log relative risk surface
Description
Estimate the ecological niche of a single species with presence/absence data and two covariates. Predict the ecological niche in geographic space.
Usage
lrren(
obs_locs,
predict = FALSE,
predict_locs = NULL,
conserve = TRUE,
alpha = 0.05,
p_correct = "none",
cv = FALSE,
kfold = 10,
balance = FALSE,
parallel = FALSE,
n_core = 2,
poly_buffer = NULL,
obs_window = NULL,
verbose = FALSE,
...
)
Arguments
obs_locs |
Input data frame of presence and absence observations with six (6) features (columns): 1) ID, 2) longitude, 3) latitude, 4) presence/absence binary variable, 5) covariate 1 as x-coordinate, 6) covariate 2 as y-coordinate. |
predict |
Logical. If TRUE, will predict the ecological niche in geographic space. If FALSE (the default), will not predict. |
predict_locs |
Input data frame of prediction locations with 4 features (columns): 1) longitude, 2) latitude, 3) covariate 1 as x-coordinate, 4) covariate 2 as y-coordinate. The covariates must be the same as those included in |
conserve |
Logical. If TRUE (the default), the ecological niche will be estimated within a concave hull around the locations in |
alpha |
Numeric. The two-tailed alpha level for the significance threshold (the default is 0.05). |
p_correct |
Optional. Character string specifying whether to apply a correction for multiple comparisons including a False Discovery Rate |
cv |
Logical. If TRUE, will calculate prediction diagnostics using internal k-fold cross-validation. If FALSE (the default), will not. |
kfold |
Integer. Specify the number of folds used in the internal cross-validation. The default is 10. |
balance |
Logical. If TRUE, the prevalence within each k-fold will be 0.50 by undersampling absence locations (assumes absence data are more frequent). If FALSE (the default), the prevalence within each k-fold will match the prevalence in |
parallel |
Logical. If TRUE, will execute the function in parallel. If FALSE (the default), will not execute the function in parallel. |
n_core |
Optional. Integer specifying the number of CPU cores on the current host for parallelization (the default is 2 cores). |
poly_buffer |
Optional. Specify a custom distance (in the same units as covariates) to add to the window within which the ecological niche is estimated. The default is 1/100th of the smallest range among the two covariates. |
obs_window |
Optional. Specify a custom window of class 'owin' within which to estimate the ecological niche. The default computes a concave hull around the data specified in |
verbose |
Logical. If TRUE (the default), will print function progress during execution. If FALSE, will not print. |
... |
Arguments passed to |
Details
This function estimates the ecological niche of a single species (presence/absence data), or the presence of one species relative to another, using two covariates, will predict the ecological niche into a geographic area and prepare k-fold cross-validation data sets for prediction diagnostics.
The function uses the risk function to estimate the spatial relative risk function and forces risk(tolerate == TRUE) in order to calculate asymptotic p-values. The estimated ecological niche can be visualized using the plot_obs function.
If predict = TRUE, this function will predict ecological niche at every location specified with predict_locs with best performance if predict_locs are gridded locations in the same study area as the observations in obs_locs - a version of environmental interpolation. The predicted spatial distribution of the estimated ecological niche can be visualized using the plot_predict function.
If cv = TRUE, this function will prepare k-fold cross-validation data sets for prediction diagnostics. The sample size of each fold depends on the number of folds set with kfold. If balance = TRUE, the sample size of each fold will be the frequency of presence locations divided by the number of folds times two. If balance = FALSE, the sample size of each fold will be the frequency of all observed locations divided by the number of folds. The cross-validation can be performed in parallel if parallel = TRUE using the future, doFuture, doRNG, and foreach packages. Two diagnostics (area under the receiver operating characteristic curve and precision-recall curve) can be visualized using the plot_cv function.
The obs_window argument may be useful to specify a 'known' window for the ecological niche (e.g., a convex hull around observed locations).
This function has functionality for a correction for multiple testing. If p_correct = "FDR", calculates a False Discovery Rate by Benjamini and Hochberg. If p_correct = "Sidak", calculates a Sidak correction. If p_correct = "Bonferroni", calculates a Bonferroni correction. If p_correct = "none" (the default), then the function does not account for multiple testing and uses the uncorrected alpha level. See the internal pval_correct function documentation for more details.
Value
An object of class 'list'. This is a named list with the following components:
outAn object of class 'list' for the estimated ecological niche.
datAn object of class 'data.frame', returns
obs_locsthat are used in the accompanying plotting functions.p_criticalA numeric value for the critical p-value used for significance tests.
The returned out is a named list with the following components:
obsAn object of class 'rrs' for the spatial relative risk.
presenceAn object of class 'ppp' for the presence locations.
absenceAn object of class 'ppp' for the absence locations.
outer_polyAn object of class 'matrix' for the coordinates of the concave hull around the observation locations.
inner_polyAn object of class 'matrix' for the coordinates of the concave hull around the observation locations. Same as
outer_poly.
If predict = TRUE, the returned out has additional components:
outer_polyAn object of class 'matrix' for the coordinates of the concave hull around the prediction locations.
predictionAn object of class 'matrix' for the coordinates of the concave hull around the prediction locations.
If cv = TRUE, the returned object of class 'list' has an additional named list cv with the following components:
cv_predictions_rrA list of length
kfoldwith values of the log relative risk surface at each point randomly selected in a cross-validation fold.cv_labelsA list of length
kfoldwith a binary value of presence (1) or absence (0) for each point randomly selected in a cross-validation fold.
Examples
if (interactive()) {
set.seed(1234) # for reproducibility
# Using the 'bei' and 'bei.extra' data within {spatstat.data}
# Covariate data (centered and scaled)
elev <- spatstat.data::bei.extra[[1]]
grad <- spatstat.data::bei.extra[[2]]
elev$v <- scale(elev)
grad$v <- scale(grad)
elev_raster <- terra::rast(elev)
grad_raster <- terra::rast(grad)
# Presence data
presence <- spatstat.data::bei
spatstat.geom::marks(presence) <- data.frame("presence" = rep(1, presence$n),
"lon" = presence$x,
"lat" = presence$y)
spatstat.geom::marks(presence)$elev <- elev[presence]
spatstat.geom::marks(presence)$grad <- grad[presence]
# (Pseudo-)Absence data
absence <- spatstat.random::rpoispp(0.008, win = elev)
spatstat.geom::marks(absence) <- data.frame("presence" = rep(0, absence$n),
"lon" = absence$x,
"lat" = absence$y)
spatstat.geom::marks(absence)$elev <- elev[absence]
spatstat.geom::marks(absence)$grad <- grad[absence]
# Combine into readable format
obs_locs <- spatstat.geom::superimpose(presence, absence, check = FALSE)
obs_locs <- spatstat.geom::marks(obs_locs)
obs_locs$id <- seq(1, nrow(obs_locs), 1)
obs_locs <- obs_locs[ , c(6, 2, 3, 1, 4, 5)]
# Prediction Data
predict_xy <- terra::crds(elev_raster)
predict_locs <- as.data.frame(predict_xy)
predict_locs$elev <- terra::extract(elev_raster, predict_xy)[ , 1]
predict_locs$grad <- terra::extract(grad_raster, predict_xy)[ , 1]
# Run lrren
test_lrren <- lrren(obs_locs = obs_locs,
predict_locs = predict_locs,
predict = TRUE,
cv = TRUE)
}