MLTuneParameters {mlexperiments} | R Documentation
R6 Class to perform hyperparameter tuning experiments
Description
The MLTuneParameters class is used to construct a parameter tuner object and to perform the tuning of a set of hyperparameters for a specified machine learning algorithm, using either grid search or Bayesian optimization.
Details
The hyperparameter tuning can be performed with either grid search or Bayesian optimization. In both cases, each hyperparameter setting is evaluated with k-fold cross-validation on the specified dataset.
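As a minimal sketch (assuming the mlexperiments package and its LearnerKnn learner are available, as in the examples below), the strategy is fixed when the tuner is constructed:

library(mlexperiments)
# grid search: every row of parameter_grid is evaluated with k-fold CV
grid_tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
# Bayesian optimization: parameter_bounds and optim_args drive the search
bayes_tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "bayesian",
  ncores = 2
)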
Super classes
mlexperiments::MLBase
-> mlexperiments::MLExperimentsBase
-> MLTuneParameters
Public fields
parameter_bounds
A named list of tuples to define the parameter bounds of the Bayesian hyperparameter optimization. For further details, please see the documentation of the ParBayesianOptimization package.
parameter_grid
A matrix with named columns in which each column represents a parameter that should be optimized and each row represents a specific hyperparameter setting that should be tested throughout the procedure. For strategy = "grid", each row of the parameter_grid is considered a setting that is evaluated. For strategy = "bayesian", the parameter_grid is passed on to the initGrid argument of the function ParBayesianOptimization::bayesOpt() in order to initialize the Bayesian process. The maximum number of rows considered for initializing the Bayesian process can be specified with the R option "mlexperiments.bayesian.max_init", which is set to 50L by default.
optim_args
A named list of further arguments to configure the Bayesian hyperparameter optimization; these are passed on to ParBayesianOptimization::bayesOpt() (e.g., iters.n, kappa, and acq, as in the examples below). For further details, please see the documentation of the ParBayesianOptimization package.
split_type
A character. The splitting strategy used to construct the k cross-validation folds. This parameter is passed on to the function splitTools::create_folds() and defaults to "stratified".
split_vector
A vector. If a criterion other than the provided y should be used for generating the cross-validation folds, it can be defined here. It is important that the vector provided here has the same length as x.
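To illustrate how these fields interact, here is a short sketch for a Bayesian run (assumptions: a tuner created with strategy = "bayesian"; the concrete values are taken from the grid example further down):

# bounds searched by the Bayesian optimization
tuner$parameter_bounds <- list(k = c(2L, 80L))
# grid used only to initialize the Bayesian process; at most
# getOption("mlexperiments.bayesian.max_init") rows are used
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
# further arguments passed on to ParBayesianOptimization::bayesOpt()
tuner$optim_args <- list(iters.n = 4, kappa = 3.5, acq = "ucb")
# fold construction via splitTools::create_folds()
tuner$split_type <- "stratified"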
Methods
Public methods
MLTuneParameters$new()
MLTuneParameters$execute()
MLTuneParameters$clone()
Inherited methods
mlexperiments::MLExperimentsBase$set_data()
Method new()
Create a new MLTuneParameters
object.
Usage
MLTuneParameters$new(
  learner,
  seed,
  strategy = c("grid", "bayesian"),
  ncores = -1L
)
Arguments
learner
An initialized learner object that inherits from class "MLLearnerBase".
seed
An integer. Needs to be set for reproducibility purposes.
strategy
A character. The strategy to optimize the hyperparameters (either "grid" or "bayesian").
ncores
An integer to specify the number of cores used for parallelization (default: -1L).
Details
For strategy = "bayesian", the number of starting iterations can be set using the R option "mlexperiments.bayesian.max_init", which defaults to 50L. This option reduces the provided initialization grid to contain at most the specified number of rows. This initialization grid is then passed on to the initGrid argument of ParBayesianOptimization::bayesOpt().
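For illustration, the option can be adjusted before the tuning is executed (a sketch; 20L is an arbitrary example value):

# cap the Bayesian initialization grid at 20 rows
options("mlexperiments.bayesian.max_init" = 20L)
getOption("mlexperiments.bayesian.max_init")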
Returns
A new MLTuneParameters R6 object.
Examples
MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
Method execute()
Execute the hyperparameter tuning.
Usage
MLTuneParameters$execute(k)
Arguments
k
An integer to define the number of cross-validation folds used to tune the hyperparameters.
Details
All results of the hyperparameter tuning are saved in the field $results of the MLTuneParameters class. After successful execution of the parameter tuning, $results contains a list with the items:
- "summary": A data.table with the summarized results (same as the return value of the execute method).
- "best.setting": The best setting (according to the learner's parameter metric_optimization_higher_better) identified during the hyperparameter tuning.
- "bayesOpt": The return value of ParBayesianOptimization::bayesOpt() (only for strategy = "bayesian").
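A short sketch of accessing these items after a successful run (assuming a tuner set up and executed as in the examples below):

res <- tuner$results
res$summary       # data.table, same as the return value of execute()
res$best.setting  # best hyperparameter setting found
res$bayesOpt      # only present for strategy = "bayesian"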
Returns
A data.table with the results of the hyperparameter optimization. The optimized metric, i.e. the cross-validated evaluation metric, is given in the column metric_optim_mean. More results are accessible from the field $results of the MLTuneParameters class.
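As a hedged illustration of working with the return value (assuming the learner's metric is minimized; see metric_optimization_higher_better):

tuning_results <- tuner$execute(k = 3)
# order the evaluated settings by the cross-validated metric
tuning_results[order(metric_optim_mean)]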
Examples
dataset <- do.call(
  cbind,
  c(sapply(paste0("col", 1:6), function(x) {
    rnorm(n = 500)
  },
  USE.NAMES = TRUE,
  simplify = FALSE
  ),
  list(target = sample(0:1, 500, TRUE))
))
tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
  iters.n = 4,
  kappa = 3.5,
  acq = "ucb"
)
# set data
tuner$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)
tuner$execute(k = 3)
Method clone()
The objects of this class are cloneable with this method.
Usage
MLTuneParameters$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
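A one-line sketch (standard R6 cloning semantics):

tuner_copy <- tuner$clone(deep = TRUE)  # deep = TRUE also clones nested R6 objects such as the learner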
See Also
ParBayesianOptimization::bayesOpt(), splitTools::create_folds()
Examples
knn_tuner <- MLTuneParameters$new(
learner = LearnerKnn$new(),
seed = 123,
strategy = "grid",
ncores = 2
)
## ------------------------------------------------
## Method `MLTuneParameters$new`
## ------------------------------------------------
MLTuneParameters$new(
learner = LearnerKnn$new(),
seed = 123,
strategy = "grid",
ncores = 2
)
## ------------------------------------------------
## Method `MLTuneParameters$execute`
## ------------------------------------------------
dataset <- do.call(
cbind,
c(sapply(paste0("col", 1:6), function(x) {
rnorm(n = 500)
},
USE.NAMES = TRUE,
simplify = FALSE
),
list(target = sample(0:1, 500, TRUE))
))
tuner <- MLTuneParameters$new(
learner = LearnerKnn$new(),
seed = 123,
strategy = "grid",
ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
k = seq(4, 68, 8),
l = 0,
test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
iters.n = 4,
kappa = 3.5,
acq = "ucb"
)
# set data
tuner$set_data(
x = data.matrix(dataset[, -7]),
y = dataset[, 7]
)
tuner$execute(k = 3)