ellipsoid_selection {tenm}    R Documentation

ellipsoid_selection: Perform model selection for ellipsoid models

Description

The function performs model selection for ellipsoid models using three criteria: a) the omission rate, b) the significance of the partial ROC and binomial tests, and c) the AUC value.

Usage

ellipsoid_selection(
  env_train,
  env_test = NULL,
  env_vars,
  nvarstest,
  level = 0.95,
  mve = TRUE,
  env_bg = NULL,
  omr_criteria,
  parallel = FALSE,
  ncores = NULL,
  comp_each = 100,
  proc = FALSE,
  proc_iter = 100,
  rseed = TRUE
)

Arguments

env_train

A data frame with the environmental training data.

env_test

A data frame with the environmental testing data. Default is NULL.

env_vars

A vector with the names of the environmental variables used in the selection process. For help choosing which variables to use, see correlation_finder.

nvarstest

A numeric vector giving the number of variables used to fit the ellipsoids during model selection.

level

Proportion of points to be included in the ellipsoids, equivalent to the error (E) proposed by Peterson et al. (2008).

mve

Logical. If TRUE, a minimum volume ellipsoid will be computed using cov.rob from MASS. If FALSE, the covariance matrix of the input data will be used.

env_bg

Environmental data used to compute the approximate prevalence of the model. It should be a sample of the environmental layers of the calibration area. Default is NULL.

omr_criteria

Omission rate criteria: the allowable omission rate for the selection process. Default is NULL (see details).

parallel

Logical. If TRUE, computations will run in parallel. Default is FALSE.

ncores

Number of cores to use for parallel processing. Default uses all available cores minus one.

comp_each

Number of models to run in each job in parallel computation. Default is 100.

proc

Logical. If TRUE, a partial ROC test will be run. Default is FALSE.

proc_iter

Numeric. Total number of iterations for the partial ROC bootstrap. Default is 100.

rseed

Logical. If TRUE, set a random seed for partial ROC bootstrap. Default is TRUE.
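
How nvarstest and comp_each relate to the number of fitted models can be sketched in base R. The sketch assumes that each value in nvarstest expands to all combinations of env_vars of that size, and that comp_each simply chunks the resulting list of models into jobs. The variable names are hypothetical and this is not the package's internal code.

```r
# Hypothetical variable names, used only for illustration
env_vars <- c("bio_01", "bio_05", "bio_06", "bio_12", "bio_15")
nvarstest <- c(2, 3)

# Assumption: every subset of env_vars of each requested size
# is one candidate model
combos <- unlist(
  lapply(nvarstest, function(k) combn(env_vars, k, simplify = FALSE)),
  recursive = FALSE
)
n_models <- length(combos)  # choose(5, 2) + choose(5, 3)

# Assumption: comp_each caps how many models go into one parallel job
comp_each <- 8
jobs <- split(seq_len(n_models), ceiling(seq_len(n_models) / comp_each))

n_models      # total number of candidate models
lengths(jobs) # number of models assigned to each job
```

Because the number of combinations grows quickly with the length of env_vars, counting the candidate models this way before a run can help decide whether parallel = TRUE is worth enabling.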

Details

Model selection occurs in environmental space (E-space). For each variable combination specified in nvarstest, the omission rate (omr) in E-space is computed using the inEllipsoid function. Results are ordered by the omr of the testing data. If env_bg is provided, an estimated prevalence is computed and results are additionally ordered by partial AUC. Model selection can be run in parallel. For more details and examples, see the help page for ellipsoid_omr.
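
The idea of an omission rate in E-space can be illustrated with base R and MASS. This is a minimal sketch on simulated data, not the package's implementation: it assumes an ellipsoid is defined by a centroid and a covariance matrix, that a point is inside when its squared Mahalanobis distance is at most the chi-square quantile at level, and that prevalence is approximated by the fraction of background points inside the ellipsoid. The helpers make_env, fit_ellipsoid and inside are hypothetical names and do not call inEllipsoid.

```r
library(MASS)  # provides cov.rob and mvrnorm

set.seed(1)

# Simulated environmental data with three hypothetical variables
make_env <- function(n, scale = 1) {
  m <- MASS::mvrnorm(n, mu = rep(0, 3), Sigma = diag(3) * scale^2)
  colnames(m) <- c("v1", "v2", "v3")
  as.data.frame(m)
}
env_train <- make_env(200)
env_test  <- make_env(80)
env_bg    <- make_env(1000, scale = 3)  # wider "calibration area"

# Fit an ellipsoid as a centroid, a covariance matrix and a distance cutoff
fit_ellipsoid <- function(x, level = 0.95, mve = TRUE) {
  x <- as.matrix(x)
  if (mve) {
    # Assumption: level sets the number of points treated as "good"
    rob <- MASS::cov.rob(x, quantile.used = ceiling(level * nrow(x)),
                         method = "mve")
    centroid <- rob$center
    covariance <- rob$cov
  } else {
    centroid <- colMeans(x)
    covariance <- stats::cov(x)
  }
  list(centroid = centroid, covariance = covariance,
       cutoff = stats::qchisq(level, df = ncol(x)))
}

# TRUE for points whose squared Mahalanobis distance is within the cutoff
inside <- function(ell, x) {
  d2 <- stats::mahalanobis(as.matrix(x), center = ell$centroid,
                           cov = ell$covariance)
  d2 <= ell$cutoff
}

ell <- fit_ellipsoid(env_train, level = 0.95, mve = TRUE)
omr_train  <- mean(!inside(ell, env_train))  # share of training points left out
omr_test   <- mean(!inside(ell, env_test))   # share of testing points left out
prevalence <- mean(inside(ell, env_bg))      # share of background inside

c(omr_train = omr_train, omr_test = omr_test, prevalence = prevalence)
```

Setting mve = FALSE in the sketch switches to the plain sample mean and covariance, which mirrors the choice described for the mve argument.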

Value

A data.frame with the model selection results, ordered as described in Details.

Author(s)

Luis Osorio-Olvera luismurao@gmail.com

References

Peterson, A.T. et al. (2008) Rethinking receiver operating characteristic analysis applications in ecological niche modeling. Ecol. Modell. 213, 63–72. doi:10.1016/j.ecolmodel.2007.11.008

Examples


library(tenm)
data("abronia")

# Directory with the example environmental layers shipped with tenm
tempora_layers_dir <- system.file("extdata/bio", package = "tenm")

# Build the species temporal data object from the occurrence records
abt <- tenm::sp_temporal_data(occs = abronia,
                              longitude = "decimalLongitude",
                              latitude = "decimalLatitude",
                              sp_date_var = "year",
                              occ_date_format = "y",
                              layers_date_format = "y",
                              layers_by_date_dir = tempora_layers_dir,
                              layers_ext = "*.tif$")

# Clean duplicated records by date
abtc <- tenm::clean_dup_by_date(abt, threshold = 10/60)

# Extract environmental data and background by date using two workers
future::plan("multisession", workers = 2)
abex <- tenm::ex_by_date(this_species = abtc, train_prop = 0.7)
abbg <- tenm::bg_by_date(this_species = abex,
                         buffer_ngbs = 10, n_bg = 50000)
future::plan("sequential")

# Find correlated variables; the last column of env_data is excluded
varcorrs <- tenm::correlation_finder(environmental_data =
                                     abex$env_data[, -ncol(abex$env_data)],
                                     method = "spearman",
                                     threshold = 0.8,
                                     verbose = FALSE)

# Split the environmental data into training and testing sets
edata <- abex$env_data
etrain <- edata[edata$trian_test == "Train", ] |> data.frame()
etest <- edata[edata$trian_test == "Test", ] |> data.frame()
bg <- abbg$env_bg

# Run model selection for all three-variable ellipsoids
res1 <- tenm::ellipsoid_selection(env_train = etrain,
                                  env_test = etest,
                                  env_vars = varcorrs$descriptors,
                                  nvarstest = 3,
                                  level = 0.975,
                                  mve = TRUE,
                                  env_bg = bg,
                                  omr_criteria = 0.1,
                                  parallel = FALSE,
                                  proc = TRUE)
head(res1)



[Package tenm version 0.5.1 Index]