BIOMOD_ModelingOptions {biomod2}    R Documentation
Configure the modeling options for each selected model
Description
Parameterize and/or tune the options of biomod2's single models.
Usage
BIOMOD_ModelingOptions(
GLM = NULL,
GBM = NULL,
GAM = NULL,
CTA = NULL,
ANN = NULL,
SRE = NULL,
FDA = NULL,
MARS = NULL,
RF = NULL,
MAXENT = NULL,
XGBOOST = NULL
)
bm_DefaultModelingOptions()
Arguments
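GLM : (optional, default NULL) a list of GLM options
GBM : (optional, default NULL) a list of GBM options
GAM : (optional, default NULL) a list of GAM options
CTA : (optional, default NULL) a list of CTA options
ANN : (optional, default NULL) a list of ANN options
SRE : (optional, default NULL) a list of SRE options
FDA : (optional, default NULL) a list of FDA options
MARS : (optional, default NULL) a list of MARS options
RF : (optional, default NULL) a list of RF options
MAXENT : (optional, default NULL) a list of MAXENT options
XGBOOST : (optional, default NULL) a list of XGBOOST options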
Details
This function allows advanced users to change some default parameters of biomod2's inner models.
Eleven single models are available within the package (GLM, GBM, GAM, CTA, ANN, SRE, FDA, MARS, RF, MAXENT and XGBOOST), and their options can be set with this function through list objects.
The bm_DefaultModelingOptions function prints all default parameter values for all available models. This output can be copied and pasted to be used as is (with the desired changes) as function arguments (see Examples).
Below is the detailed list of all modifiable parameters for each available model.
Value
A BIOMOD.models.options object that can be used to build species distribution model(s) with the BIOMOD_Modeling function.
GLM (glm)
myFormula : a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the GLM formula automatically with the following parameters:
    type = 'quadratic' : formula given to the model, must be simple, quadratic or polynomial
    interaction.level = 0 : an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly increase the number of effective variables used in the GLM!)
  or construct a specific formula
test = 'AIC' : information criterion for the stepwise selection procedure, must be AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion) or none (consider only the full model, with no stepwise selection, but this can lead to convergence issues and strange results!)
family = binomial(link = 'logit') : a character defining the error distribution and link function to be used in the model, must be a family name, a family function or the result of a call to a family function (see family) (so far, biomod2 only runs on presence-absence data, so the binomial family is the default!)
control : a list of parameters to control the fitting process (passed to glm.control)
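As a rough illustration of the "construct a specific formula" route, the sketch below builds a quadratic formula by hand in base R and stores it in a GLM option list. make_quadratic_formula is a hypothetical helper, not part of biomod2, and it assumes that a quadratic formula means one linear and one squared term per variable. The response and variable names are borrowed from the Examples section.

```r
# Hypothetical helper (not part of biomod2): one linear and one squared term
# per explanatory variable
make_quadratic_formula <- function(resp, vars) {
  squared <- paste0("I(", vars, "^2)")
  reformulate(c(vars, squared), response = resp)
}

quad_form <- make_quadratic_formula("GuloGulo", c("bio_3", "bio_12"))
print(quad_form)

# Supplying myFormula switches off type and interaction.level
glm_opts <- list(myFormula = quad_form,
                 test = "none",
                 family = binomial(link = "logit"))

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(GLM = glm_opts)
```

Only the entries that differ from the defaults need to appear in the list, as the Examples section does with myFormula alone.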
GBM (default gbm)
Please refer to the gbm help file for more details.
distribution = 'bernoulli'
n.trees = 2500
interaction.depth = 7
n.minobsinnode = 5
shrinkage = 0.001
bag.fraction = 0.5
train.fraction = 1
cv.folds = 3
keep.data = FALSE
verbose = FALSE
perf.method = 'cv'
n.cores = 1
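To see which GBM values would be in effect after changing a few of them, the defaults listed above can be merged with a short list of overrides. This is only a base R sketch: gbm_defaults is transcribed from this page rather than read from the package, and the override values are arbitrary examples.

```r
# Defaults as documented on this page (transcribed by hand, not queried
# from biomod2)
gbm_defaults <- list(distribution = "bernoulli", n.trees = 2500,
                     interaction.depth = 7, n.minobsinnode = 5,
                     shrinkage = 0.001, bag.fraction = 0.5,
                     train.fraction = 1, cv.folds = 3,
                     keep.data = FALSE, verbose = FALSE,
                     perf.method = "cv", n.cores = 1)

# Arbitrary example overrides: fewer trees, faster learning rate, more folds
gbm_changes <- list(n.trees = 1000, shrinkage = 0.01, cv.folds = 5)

# modifyList() replaces matching entries and keeps everything else
gbm_opts <- utils::modifyList(gbm_defaults, gbm_changes)
str(gbm_opts)

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(GBM = gbm_opts)
```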
GAM
algo = 'GAM_gam' : a character defining the chosen GAM function, must be GAM_gam (see gam in the gam package), GAM_mgcv (see gam in the mgcv package) or BAM_mgcv (see bam in the mgcv package)
myFormula : a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the GAM formula automatically with the following parameters:
    type = 's_smoother' : the smoother used to generate the formula
    interaction.level = 0 : an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly increase the number of effective variables used in the GAM!)
  or construct a specific formula
k = -1 : value passed to the smooth terms in the formula argument to gam, must be -1 or 4 (see s in the gam or mgcv package)
family = binomial(link = 'logit') : a character defining the error distribution and link function to be used in the model, must be a family name, a family function or the result of a call to a family function (see family) (so far, biomod2 only runs on presence-absence data, so the binomial family is the default!)
control : a list of parameters to control the fitting process (passed to gam.control in the gam or mgcv package)
Some options specific to GAM_mgcv (ignored if algo = 'GAM_gam'):
  method = 'GCV.Cp'
  optimizer = c('outer','newton')
  select = FALSE
  knots = NULL
  paramPen = NULL
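A hand-written GAM formula can be assembled in the same spirit. The sketch below assumes that an 's_smoother' formula wraps each variable in s() and that k, when different from -1, is written into each smooth term. make_smooth_formula is a hypothetical helper, not the code biomod2 uses internally.

```r
# Hypothetical helper (not part of biomod2): one s() term per variable;
# k = -1 leaves the basis dimension to the smoother's own default
make_smooth_formula <- function(resp, vars, k = -1) {
  smooth_terms <- if (k == -1) {
    paste0("s(", vars, ")")
  } else {
    paste0("s(", vars, ", k = ", k, ")")
  }
  reformulate(smooth_terms, response = resp)
}

gam_form <- make_smooth_formula("GuloGulo", c("bio_3", "bio_12"), k = 4)
print(gam_form)

# Options for the mgcv-based fit; method and select only apply to GAM_mgcv
gam_opts <- list(algo = "GAM_mgcv",
                 myFormula = gam_form,
                 method = "GCV.Cp",
                 select = FALSE)

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(GAM = gam_opts)
```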
CTA (rpart)
Please refer to the rpart help file for more details.
method = 'class'
parms = 'default' : if 'default', the default rpart parms values are kept
cost = NULL
control : see rpart.control
ANN (nnet)
NbCV = 5 : an integer corresponding to the number of cross-validation repetitions used to find the best size and decay parameters
size = NULL : an integer corresponding to the number of units in the hidden layer. If NULL, the size parameter is optimized by cross-validation based on model AUC (NbCV cross-validations; the tested sizes are c(2, 4, 6, 8)). It is also possible to give a vector of size values to be tested, and the one giving the best model AUC will be kept.
decay = NULL : a numeric corresponding to the weight decay. If NULL, the decay parameter is optimized by cross-validation based on model AUC (NbCV cross-validations; the tested decay values are c(0.001, 0.01, 0.05, 0.1)). It is also possible to give a vector of decay values to be tested, and the one giving the best model AUC will be kept.
rang = 0.1 : a numeric corresponding to the initial random weights on [-rang, rang]
maxit = 200 : an integer corresponding to the maximum number of iterations
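The cost of leaving size and decay at NULL can be estimated by counting candidate combinations. The sketch below assumes that the candidate sizes and decays are fully crossed and that each pair is evaluated NbCV times; the page gives the candidate vectors but does not spell out how they are combined, so treat the counts as an upper-bound illustration.

```r
NbCV <- 5
size_candidates <- c(2, 4, 6, 8)               # tested when size = NULL
decay_candidates <- c(0.001, 0.01, 0.05, 0.1)  # tested when decay = NULL

# Assumption: every size is tried with every decay
tuning_grid <- expand.grid(size = size_candidates, decay = decay_candidates)
n_pairs <- nrow(tuning_grid)
n_fits <- n_pairs * NbCV
cat(n_pairs, "size/decay pairs,", n_fits, "nnet fits\n")

# Narrowing the search: two sizes, one fixed decay, fewer repetitions
ann_opts <- list(NbCV = 3, size = c(4, 8), decay = 0.01, maxit = 500)

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(ANN = ann_opts)
```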
SRE (bm_SRE)
quant = 0.025 : a numeric corresponding to the quantile of 'extreme environmental variable' values removed to select species envelopes
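A minimal base R sketch of the rectilinear envelope idea that quant controls, assuming the envelope of each variable is the interval between its quant and 1 - quant quantiles at presence points. sre_envelope and sre_predict are hypothetical helpers run on toy data; they are not the bm_SRE implementation.

```r
# Hypothetical helper: lower and upper bound per variable, computed from the
# environmental values observed at presence points
sre_envelope <- function(presence_env, quant = 0.025) {
  bounds <- sapply(presence_env, quantile,
                   probs = c(quant, 1 - quant), names = FALSE)
  bounds <- t(bounds)                      # one row per variable
  colnames(bounds) <- c("lower", "upper")
  bounds
}

# Hypothetical helper: 1 if a site lies inside the envelope for every
# variable, 0 otherwise
sre_predict <- function(envelope, new_env) {
  inside <- rep(TRUE, nrow(new_env))
  for (v in rownames(envelope)) {
    inside <- inside &
      new_env[[v]] >= envelope[v, "lower"] &
      new_env[[v]] <= envelope[v, "upper"]
  }
  as.integer(inside)
}

# Toy presence data (made up for illustration)
presence_env <- data.frame(temp = 0:100, rain = seq(500, 1500, by = 10))
envelope <- sre_envelope(presence_env, quant = 0.025)
print(envelope)

# Three toy sites: inside, temp too low, rain too high
new_env <- data.frame(temp = c(50, 1, 50), rain = c(1000, 1000, 1490))
pred <- sre_predict(envelope, new_env)
print(pred)
```

A larger quant trims more of the extreme values and therefore gives a narrower envelope.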
FDA (fda)
Please refer to the fda help file for more details.
method = 'mars'
add_args = NULL : a list of additional parameters for method, passed to the ... argument of the fda function
MARS (earth)
Please refer to the earth help file for more details.
myFormula : a typical formula object (see Examples). If not NULL, the type and interaction.level parameters are switched off. You can choose to either:
  generate the MARS formula automatically with the following parameters:
    type = 'simple' : formula given to the model, must be simple, quadratic or polynomial
    interaction.level = 0 : an integer corresponding to the interaction level between the considered variables (be aware that interactions quickly increase the number of effective variables used in the MARS!)
  or construct a specific formula
nk = NULL : an integer corresponding to the maximum number of model terms. If NULL, the default MARS function value is used: max(21, 2 * nb_expl_var + 1)
penalty = 2
thresh = 0.001
nprune = NULL
pmethod = 'backward'
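The default for nk quoted above is easy to evaluate by hand. The snippet below wraps it in a small function (default_nk is a hypothetical name used only here) and then shows an option list that sets nk explicitly; the value 30 is an arbitrary example.

```r
# Default maximum number of model terms when nk = NULL, as given above
default_nk <- function(nb_expl_var) max(21, 2 * nb_expl_var + 1)

nk_5 <- default_nk(5)    # 2 * 5 + 1 = 11, so the floor of 21 applies
nk_12 <- default_nk(12)  # 2 * 12 + 1 = 25, which exceeds 21
print(c(nk_5, nk_12))

# Setting nk explicitly instead of relying on the default
mars_opts <- list(type = "simple", interaction.level = 0,
                  nk = 30, pmethod = "backward")

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(MARS = mars_opts)
```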
RF
do.classif = TRUE : if TRUE, random forest classification is computed, otherwise random forest regression is done
ntree = 500
mtry = 'default'
sampsize = NULL
nodesize = 5
maxnodes = NULL
MAXENT (https://biodiversityinformatics.amnh.org/open_source/maxent/)
path_to_maxent.jar = getwd() : a character corresponding to the location of the maxent.jar file
memory_allocated = 512 : an integer corresponding to the amount of memory (in MB) reserved for java to run MAXENT, must be 64, 128, 256, 512, 1024... or NULL to use the default java memory limit
initial_heap_size = NULL : a character corresponding to the initial heap space (shared memory space) allocated to java. Argument passed to -Xms when calling java. Used in BIOMOD_Projection but not in BIOMOD_Modeling. Values can be 1024K, 4096M, 10G... or NULL to use the default java parameter
max_heap_size = NULL : a character corresponding to the maximum heap space (shared memory space) allocated to java. Argument passed to -Xmx when calling java. Used in BIOMOD_Projection but not in BIOMOD_Modeling. Must be larger than initial_heap_size. Values can be 1024K, 4096M, 10G... or NULL to use the default java parameter
background_data_dir : a character corresponding to the directory path where explanatory variables are stored as ASCII files (raster format). If specified, MAXENT will generate its own background data from the explanatory variable rasters (as usually done in MAXENT studies). Otherwise biomod2 pseudo-absences will be used (see BIOMOD_FormatingData)
maximumbackground : an integer corresponding to the maximum number of background data to sample if the background_data_dir parameter has been set
maximumiterations = 200 : an integer corresponding to the maximum number of iterations to do
visible = FALSE : a logical to make the MAXENT user interface available
linear = TRUE : a logical to allow linear features to be used
quadratic = TRUE : a logical to allow quadratic features to be used
product = TRUE : a logical to allow product features to be used
threshold = TRUE : a logical to allow threshold features to be used
hinge = TRUE : a logical to allow hinge features to be used
lq2lqptthreshold = 80 : an integer corresponding to the number of samples at which product and threshold features start being used
l2lqthreshold = 10 : an integer corresponding to the number of samples at which quadratic features start being used
hingethreshold = 15 : an integer corresponding to the number of samples at which hinge features start being used
beta_threshold = -1.0 : a numeric corresponding to the regularization parameter to be applied to all threshold features (a negative value enables automatic setting)
beta_categorical = -1.0 : a numeric corresponding to the regularization parameter to be applied to all categorical features (a negative value enables automatic setting)
beta_lqp = -1.0 : a numeric corresponding to the regularization parameter to be applied to all linear, quadratic and product features (a negative value enables automatic setting)
beta_hinge = -1.0 : a numeric corresponding to the regularization parameter to be applied to all hinge features (a negative value enables automatic setting)
betamultiplier = 1 : a numeric to multiply all automatic regularization parameters (a higher number gives a more spread-out distribution)
defaultprevalence = 0.5 : a numeric corresponding to the default prevalence of the species (probability of presence at ordinary occurrence points)
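Because max_heap_size must be larger than initial_heap_size and both are given as strings such as 1024K, 4096M or 10G, it can help to compare them numerically before building the option list. heap_to_bytes is a hypothetical helper written for this sketch; it only accepts an integer followed by an upper-case K, M or G. The other option values are arbitrary examples.

```r
# Hypothetical helper: convert a java-style heap size ("1024K", "4096M",
# "10G") to bytes, using multiples of 1024
heap_to_bytes <- function(x) {
  stopifnot(is.character(x), length(x) == 1, grepl("^[0-9]+[KMG]$", x))
  unit <- substr(x, nchar(x), nchar(x))
  value <- as.numeric(substr(x, 1, nchar(x) - 1))
  value * 1024^c(K = 1, M = 2, G = 3)[[unit]]
}

initial_heap_size <- "4096M"
max_heap_size <- "10G"

# Check the documented constraint before passing the values on
stopifnot(heap_to_bytes(max_heap_size) > heap_to_bytes(initial_heap_size))

maxent_opts <- list(path_to_maxent.jar = getwd(),
                    memory_allocated = 1024,
                    initial_heap_size = initial_heap_size,
                    max_heap_size = max_heap_size,
                    maximumiterations = 500,
                    hinge = FALSE)

# With biomod2 loaded, the list would be passed as:
# myBiomodOptions <- BIOMOD_ModelingOptions(MAXENT = maxent_opts)
```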
XGBOOST (default xgboost)
Please refer to the xgboost help file for more details.
max.depth = 5
eta = 0.1
nrounds = 512
objective = "binary:logistic"
nthread = 1
Author(s)
Damien Georges, Wilfried Thuiller
See Also
BIOMOD_Tuning, BIOMOD_Modeling
Other Main functions: BIOMOD_EnsembleForecasting(), BIOMOD_EnsembleModeling(), BIOMOD_FormatingData(), BIOMOD_LoadModels(), BIOMOD_Modeling(), BIOMOD_PresenceOnly(), BIOMOD_Projection(), BIOMOD_RangeSize(), BIOMOD_Tuning()
Examples
library(biomod2)
library(terra)
# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)
# Select the name of the studied species
myRespName <- 'GuloGulo'
# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])
# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]
# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)
# ---------------------------------------------------------------#
# Print default modeling options
bm_DefaultModelingOptions()
# Create default modeling options
myBiomodOptions <- BIOMOD_ModelingOptions()
myBiomodOptions
# # Part (or totality) of the print can be copied and customized
# # Below is an example to compute a quadratic GLM and select the best model with the 'BIC' criterion
# myBiomodOptions <- BIOMOD_ModelingOptions(
# GLM = list(type = 'quadratic',
# interaction.level = 0,
# myFormula = NULL,
# test = 'BIC',
# family = 'binomial',
# control = glm.control(epsilon = 1e-08,
# maxit = 1000,
# trace = FALSE)))
# myBiomodOptions
#
# # It is also possible to give a specific GLM formula
# myForm <- 'GuloGulo ~ bio_3 + poly(bio_4, 2) + bio_12 + bio_3:bio_12'
# myBiomodOptions <- BIOMOD_ModelingOptions(GLM = list(myFormula = formula(myForm)))
# myBiomodOptions