MLTuneParameters {mlexperiments}    R Documentation

R6 Class to perform hyperparameter tuning experiments

Description

The MLTuneParameters class is used to construct a parameter tuner object and to perform the tuning of a set of hyperparameters for a specified machine learning algorithm using either a grid search or a Bayesian optimization.

Details

The hyperparameter tuning can be performed with a grid search or a Bayesian optimization. In both cases, each hyperparameter setting is evaluated in a k-fold cross-validation on the dataset specified.
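A minimal sketch of how the two strategies are typically configured (LearnerKnn and the concrete values are purely illustrative, as in the examples below; for a complete, runnable workflow see the Examples section):

# grid search: every row of parameter_grid is evaluated in a k-fold CV
tuner_grid <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner_grid$parameter_grid <- expand.grid(k = seq(4, 68, 8), l = 0)
# (the full example below additionally supplies a "test" column for LearnerKnn)

# Bayesian optimization: parameter_bounds define the search space and
# optim_args hold further settings for the Bayesian optimization
tuner_bayes <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "bayesian",
  ncores = 2
)
tuner_bayes$parameter_bounds <- list(k = c(2L, 80L))
tuner_bayes$optim_args <- list(iters.n = 4, acq = "ucb")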

Super classes

mlexperiments::MLBase -> mlexperiments::MLExperimentsBase -> MLTuneParameters

Public fields

parameter_bounds

A named list of tuples to define the parameter bounds of the Bayesian hyperparameter optimization. For further details please see the documentation of the ParBayesianOptimization package.

parameter_grid

A matrix with named columns in which each column represents a parameter that should be optimized and each row represents a specific hyperparameter setting that should be tested throughout the procedure. For strategy = "grid", each row of the parameter_grid is considered a setting that is evaluated. For strategy = "bayesian", the parameter_grid is passed further on to the initGrid argument of the function ParBayesianOptimization::bayesOpt() in order to initialize the Bayesian process. The maximum number of rows considered for initializing the Bayesian process can be specified with the R option "mlexperiments.bayesian.max_init", which is set to 50L by default.

optim_args

A named list of further arguments for the Bayesian hyperparameter optimization that are passed on to ParBayesianOptimization::bayesOpt(), e.g., iters.n, kappa, and acq (see the examples). For further details please see the documentation of the ParBayesianOptimization package.

split_type

A character. The splitting strategy to construct the k cross-validation folds. This parameter is passed further on to the function splitTools::create_folds() and defaults to "stratified".

split_vector

A vector. If a criterion other than the provided y should be used for generating the cross-validation folds, it can be defined here. It is important that a vector of the same length as x is provided.
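A short sketch of the two splitting-related fields, assuming a tuner object created with MLTuneParameters$new() as in the examples below; the split type "grouped" and the grouping vector are purely illustrative (500 observations, matching the example data), and splitTools::create_folds() documents the available split types:

# use grouped folds instead of the default stratified splitting
tuner$split_type <- "grouped"

# illustrative grouping criterion with one entry per row of x
tuner$split_vector <- sample(paste0("group_", 1:5), size = 500, replace = TRUE)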

Methods

Public methods

Inherited methods

Method new()

Create a new MLTuneParameters object.

Usage
MLTuneParameters$new(
  learner,
  seed,
  strategy = c("grid", "bayesian"),
  ncores = -1L
)
Arguments
learner

An initialized learner object that inherits from class "MLLearnerBase".

seed

An integer. Needs to be set for reproducibility purposes.

strategy

A character. The strategy to optimize the hyperparameters (either "grid" or "bayesian").

ncores

An integer to specify the number of cores used for parallelization (default: -1L).

Details

For strategy = "bayesian", the number of starting iterations can be set using the R option "mlexperiments.bayesian.max_init", which defaults to 50L. This option reduces the provided initialization grid to contain at most the specified number of rows. This initialization grid is then further passed on to the initGrid argument of ParBayesianOptimization::bayesOpt.
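For example, the initialization grid could be capped at 20 rows before starting a Bayesian run (a minimal sketch of setting the documented option):

# cap the Bayesian initialization grid at 20 rows (package default: 50L)
options(mlexperiments.bayesian.max_init = 20L)
getOption("mlexperiments.bayesian.max_init")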

Returns

A new MLTuneParameters R6 object.

Examples
MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


Method execute()

Execute the hyperparameter tuning.

Usage
MLTuneParameters$execute(k)
Arguments
k

An integer to define the number of cross-validation folds used to tune the hyperparameters.

Details

All results of the hyperparameter tuning are saved in the field $results of the MLTuneParameters class. After successful execution of the parameter tuning, $results contains a list with the following items:

"summary"

A data.table with the summarized results (same as the returned value of the execute method).

"best.setting"

The best setting (according to the learner's parameter metric_optimization_higher_better) identified during the hyperparameter tuning.

"bayesOpt"

The returned value of ParBayesianOptimization::bayesOpt() (only for strategy = "bayesian").

Returns

A data.table with the results of the hyperparameter optimization. The optimized metric, i.e., the cross-validated evaluation metric, is given in the column metric_optim_mean. More results are accessible from the field $results of the MLTuneParameters class.
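For instance, after a successful tuning run the stored results could be inspected as follows (a sketch assuming a tuner object configured and executed as in the example below):

res <- tuner$execute(k = 3)

# summarized results (identical to the returned data.table)
tuner$results$summary

# best hyperparameter setting identified during tuning
tuner$results$best.setting

# raw ParBayesianOptimization::bayesOpt() output (only for strategy = "bayesian")
tuner$results$bayesOpt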

Examples
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)
tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
  iters.n = 4,
  kappa = 3.5,
  acq = "ucb"
)

# set data
tuner$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

tuner$execute(k = 3)


Method clone()

The objects of this class are cloneable with this method.

Usage
MLTuneParameters$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

ParBayesianOptimization::bayesOpt(), splitTools::create_folds()

Examples

knn_tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


## ------------------------------------------------
## Method `MLTuneParameters$new`
## ------------------------------------------------

MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)


## ------------------------------------------------
## Method `MLTuneParameters$execute`
## ------------------------------------------------

dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)
tuner <- MLTuneParameters$new(
  learner = LearnerKnn$new(),
  seed = 123,
  strategy = "grid",
  ncores = 2
)
tuner$parameter_bounds <- list(k = c(2L, 80L))
tuner$parameter_grid <- expand.grid(
  k = seq(4, 68, 8),
  l = 0,
  test = parse(text = "fold_test$x")
)
tuner$split_type <- "stratified"
tuner$optim_args <- list(
  iters.n = 4,
  kappa = 3.5,
  acq = "ucb"
)

# set data
tuner$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

tuner$execute(k = 3)


[Package mlexperiments version 0.0.3 Index]