train {SDMtune} | R Documentation |
Train
Description
Train a model using one of the following methods: Artificial Neural Networks, Boosted Regression Trees, Maxent, Maxnet or Random Forest.
Usage
train(method, data, folds = NULL, progress = TRUE, ...)
Arguments
method |
character or character vector. Method used to train the model, possible values are "ANN", "BRT", "Maxent", "Maxnet" or "RF", see details. |
data |
SWD object with presence and absence/background locations. |
folds |
list. Output of the function randomFolds or folds object created with other packages, see details. |
progress |
logical. If |
... |
Arguments passed to the relative method, see details. |
Details
For the ANN method possible arguments are (for more details see nnet):
size: integer. Number of the units in the hidden layer.
decay numeric. Weight decay, default is 0.
rang numeric. Initial random weights, default is 0.7.
maxit integer. Maximum number of iterations, default is 100.
For the BRT method possible arguments are (for more details see gbm):
distribution: character. Name of the distribution to use, default is "bernoulli".
n.trees: integer. Maximum number of tree to grow, default is 100.
interaction.depth: integer. Maximum depth of each tree, default is 1.
shrinkage: numeric. The shrinkage parameter, default is 0.1.
bag.fraction: numeric. Random fraction of data used in the tree expansion, default is 0.5.
For the RF method the model is trained as classification. Possible arguments are (for more details see randomForest):
mtry: integer. Number of variable randomly sampled at each split, default is
floor(sqrt(number of variables))
.ntree: integer. Number of tree to grow, default is 500.
nodesize: integer. Minimum size of terminal nodes, default is 1.
Maxent models are trained using the arguments
"removeduplicates=false"
and"addsamplestobackground=false"
. Use the function thinData to remove duplicates and the function addSamplesToBg to add presence locations to background locations. For the Maxent method, possible arguments are:reg: numeric. The value of the regularization multiplier, default is 1.
fc: character. The value of the feature classes, possible values are combinations of "l", "q", "p", "h" and "t", default is "lqph".
iter: numeric. Number of iterations used by the MaxEnt algorithm, default is 500.
Maxnet models are trained using the argument
"addsamplestobackground = FALSE"
, use the function addSamplesToBg to add presence locations to background locations. For the Maxnet method, possible arguments are (for more details see maxnet):reg: numeric. The value of the regularization intensity, default is 1.
fc: character. The value of the feature classes, possible values are combinations of "l", "q", "p", "h" and "t", default is "lqph".
The folds argument accepts also objects created with other packages: ENMeval or blockCV. In this case the function converts internally the folds into a format valid for SDMtune.
When multiple methods are given as method
argument, the function returns a
named list of model object, with the name corresponding to the used method,
see examples.
Value
An SDMmodel or SDMmodelCV or a list of model objects.
Author(s)
Sergio Vignali
References
Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0.
Brandon Greenwell, Bradley Boehmke, Jay Cunningham and GBM Developers (2019). gbm: Generalized Boosted Regression Models. https://CRAN.R-project.org/package=gbm.
A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2(3), 18–22.
Hijmans, Robert J., Steven Phillips, John Leathwick, and Jane Elith. 2017. dismo: Species Distribution Modeling. https://cran.r-project.org/package=dismo.
Steven Phillips (2017). maxnet: Fitting 'Maxent' Species Distribution Models with 'glmnet'. https://CRAN.R-project.org/package=maxnet.
Muscarella, R., Galante, P.J., Soley-Guardia, M., Boria, R.A., Kass, J., Uriarte, M. and R.P. Anderson (2014). ENMeval: An R package for conducting spatially independent evaluations and estimating optimal model complexity for ecological niche models. Methods in Ecology and Evolution.
Roozbeh Valavi, Jane Elith, José Lahoz-Monfort and Gurutzeta Guillera-Arroita (2018). blockCV: Spatial and environmental blocking for k-fold cross-validation. https://github.com/rvalavi/blockCV.
See Also
Examples
# Acquire environmental variables
files <- list.files(path = file.path(system.file(package = "dismo"), "ex"),
pattern = "grd",
full.names = TRUE)
predictors <- terra::rast(files)
# Prepare presence and background locations
p_coords <- virtualSp$presence
bg_coords <- virtualSp$background
# Create SWD object
data <- prepareSWD(species = "Virtual species",
p = p_coords,
a = bg_coords,
env = predictors,
categorical = "biome")
## Train a Maxent model
model <- train(method = "Maxent",
data = data,
fc = "l",
reg = 1.5,
iter = 700)
# Add samples to background. This should be done preparing the data before
# training the model without using
data <- addSamplesToBg(data)
model <- train("Maxent",
data = data)
## Train a Maxnet model
model <- train(method = "Maxnet",
data = data,
fc = "lq",
reg = 1.5)
## Cross Validation
# Create 4 random folds splitting only the presence data
folds <- randomFolds(data,
k = 4,
only_presence = TRUE)
model <- train(method = "Maxnet",
data = data,
fc = "l",
reg = 0.8,
folds = folds)
## Not run:
# Run only if you have the package ENMeval installed
## Block partition using the ENMeval package
require(ENMeval)
block_folds <- get.block(occ = data@coords[data@pa == 1, ],
bg.coords = data@coords[data@pa == 0, ])
model <- train(method = "Maxnet",
data = data,
fc = "l",
reg = 0.8,
folds = block_folds)
## Checkerboard1 partition using the ENMeval package
cb_folds <- get.checkerboard1(occ = data@coords[data@pa == 1, ],
env = predictors,
bg.coords = data@coords[data@pa == 0, ],
aggregation.factor = 4)
model <- train(method = "Maxnet",
data = data,
fc = "l",
reg = 0.8,
folds = cb_folds)
## Environmental block using the blockCV package
# Run only if you have the package blockCV
require(blockCV)
# Create sf object
sf_df <- sf::st_as_sf(cbind(data@coords, pa = data@pa),
coords = c("X", "Y"),
crs = terra::crs(predictors,
proj = TRUE))
# Spatial blocks
spatial_folds <- cv_spatial(x = sf_df,
column = "pa",
rows_cols = c(8, 10),
k = 5,
hexagon = FALSE,
selection = "systematic")
model <- train(method = "Maxnet",
data = data,
fc = "l",
reg = 0.8,
folds = spatial_folds)
## End(Not run)
## Train presence absence models
# Prepare presence and absence locations
p_coords <- virtualSp$presence
a_coords <- virtualSp$absence
# Create SWD object
data <- prepareSWD(species = "Virtual species",
p = p_coords,
a = a_coords,
env = predictors[[1:5]])
## Train an Artificial Neural Network model
model <- train("ANN",
data = data,
size = 10)
## Train a Random Forest model
model <- train("RF",
data = data,
ntree = 300)
## Train a Boosted Regression Tree model
model <- train("BRT",
data = data,
n.trees = 300,
shrinkage = 0.001)
## Multiple methods trained together with default arguments
output <- train(method = c("ANN", "BRT", "RF"),
data = data,
size = 10)
output$ANN
output$BRT
output$RF
## Multiple methods trained together passing extra arguments
output <- train(method = c("ANN", "BRT", "RF"),
data = data,
size = 10,
ntree = 300,
n.trees = 300,
shrinkage = 0.001)
output