rf_tuning {spatialRF} | R Documentation |
Tuning of random forest hyperparameters via spatial cross-validation
Description
Uses grid search to find the combination of the random forest hyperparameters num.trees, mtry, and min.node.size that maximizes the model's R squared (or AUC, if the response variable is binomial), as estimated by spatial cross-validation with rf_evaluate().
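As a rough illustration of the grid such a search explores, the candidate values of the three hyperparameters can be crossed with base R's expand.grid(). This is only a sketch of the idea, not the internal code of rf_tuning(); the candidate values are the ones used in the Examples section, and the object name candidates is hypothetical.

```r
# Hypothetical candidate values (the same ones used in the Examples section)
candidates <- expand.grid(
  num.trees = c(100, 500),
  mtry = c(2, 8),
  min.node.size = c(5, 10)
)

# Each row is one hyperparameter combination to be evaluated
print(candidates)

# Number of combinations: 2 x 2 x 2 = 8
nrow(candidates)
```

Because the grid grows multiplicatively with the number of candidate values per hyperparameter, short candidate vectors keep the number of models to evaluate manageable.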
Usage
rf_tuning(
  model = NULL,
  num.trees = NULL,
  mtry = NULL,
  min.node.size = NULL,
  xy = NULL,
  repetitions = 30,
  training.fraction = 0.75,
  seed = 1,
  verbose = TRUE,
  n.cores = parallel::detectCores() - 1,
  cluster = NULL
)
Arguments
model |
A model fitted with |
num.trees |
Numeric integer vector with the number of trees to fit on each model repetition. Default: |
mtry |
Numeric integer vector, number of predictors to randomly select from the complete pool of predictors on each tree split. Default: |
min.node.size |
Numeric integer, minimal number of cases in a terminal node. Default: |
xy |
Data frame or matrix with two columns containing coordinates and named "x" and "y". If |
repetitions |
Integer, number of independent spatial folds to use during the cross-validation. Default: 30
training.fraction |
Proportion between 0.2 and 0.9 indicating the fraction of records to be used in model training. Default: 0.75
seed |
Integer, random seed to facilitate reproducibility. If set to a given number, the results of the function are always the same. Default: 1
verbose |
Logical. If TRUE, messages and plots generated during the execution of the function are displayed. Default: TRUE
n.cores |
Integer, number of cores to use for parallel execution. Creates a socket cluster with |
cluster |
A cluster definition generated with |
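A minimal sketch of preparing a cluster object, assuming that the cluster argument accepts a socket cluster created with parallel::makeCluster(). The n.cores description mentions a socket cluster, but the exact constructor is an assumption here. The object name my.cluster is hypothetical, and the rf_tuning() call is left commented out because it depends on a fitted model.

```r
library(parallel)

# Create a socket (PSOCK) cluster with two workers
my.cluster <- makeCluster(2, type = "PSOCK")

# Sanity check: run a trivial task on the workers
squares <- parLapply(my.cluster, 1:4, function(i) i^2)

# Assumption: the cluster object could then be passed on, e.g.
# rf_tuning(model = out, xy = xy, cluster = my.cluster)

# Release the workers when done
stopCluster(my.cluster)
```

Creating the cluster once and reusing it across several calls avoids paying the worker start-up cost each time, at the price of having to stop it explicitly.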
Value
A model with a new slot named tuning, which contains a data frame with the results of the tuning analysis.
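To make the selection step concrete, the sketch below picks the best combination from a mock results table. The table layout, the column name r.squared, and all values are invented for illustration; the actual structure of the tuning slot may differ.

```r
# Mock tuning results; column names and values are invented for illustration
results <- data.frame(
  num.trees = c(100, 100, 500, 500),
  mtry = c(2, 8, 2, 8),
  min.node.size = c(5, 5, 5, 5),
  r.squared = c(0.41, 0.47, 0.43, 0.52)
)

# Select the row with the highest (mock) cross-validated R squared
best <- results[which.max(results$r.squared), ]
print(best)
```

With these mock values the selected row is the one with num.trees = 500 and mtry = 8.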
See Also
Examples
if(interactive()){

  # load the package and the example data
  library(spatialRF)
  data(plant_richness_df)
  data(distance_matrix)

  # fit the model to tune
  out <- rf(
    data = plant_richness_df,
    dependent.variable.name = "richness_species_vascular",
    predictor.variable.names = colnames(plant_richness_df)[5:21],
    distance.matrix = distance_matrix,
    distance.thresholds = 0,
    n.cores = 1
  )

  # tune the model over a 2 x 2 x 2 grid of hyperparameter values
  tuning <- rf_tuning(
    model = out,
    num.trees = c(100, 500),
    mtry = c(2, 8),
    min.node.size = c(5, 10),
    xy = plant_richness_df[, c("x", "y")],
    n.cores = 1
  )

}