R: R6 Class to perform cross-validation experiments

MLCrossValidation {mlexperiments}

R Documentation

Description

The MLCrossValidation class is used to construct a cross-validation object and to perform a k-fold cross-validation for a specified machine learning algorithm using one distinct hyperparameter setting.

Details

The MLCrossValidation class requires a named list of predefined row indices for the cross-validation folds, e.g., created with the function splitTools::create_folds(). The length of this list also defines the k of the k-fold cross-validation. To perform a repeated k-fold cross-validation, provide a list with all repeated fold definitions, e.g., by specifying the argument m_rep of splitTools::create_folds().
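
As a minimal sketch of what such fold lists look like, the block below creates a plain and a repeated fold definition with splitTools::create_folds(). The toy target y and the object names folds and folds_rep are illustrative only; the splitTools package is assumed to be installed.

```r
library(splitTools)

# toy binary target with 100 observations (illustrative data)
set.seed(123)
y <- sample(0:1, 100, replace = TRUE)

# plain 3-fold cross-validation: a named list with three elements
folds <- create_folds(y = y, k = 3, type = "stratified", seed = 123)

# repeated cross-validation: 3 folds x 2 repetitions = 6 elements
folds_rep <- create_folds(
  y = y,
  k = 3,
  m_rep = 2,
  type = "stratified",
  seed = 123
)

length(folds)      # number of folds in the plain definition
length(folds_rep)  # number of folds across all repetitions
names(folds_rep)   # each element is one fold of one repetition

# each element holds the in-sample row indices of that fold
str(folds[[1]])
```

Either list can be passed as fold_list; with the repeated definition, every element is treated as one cross-validation fold.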

Super classes

mlexperiments::MLBase -> mlexperiments::MLExperimentsBase -> MLCrossValidation

Public fields

fold_list: A named list of predefined row indices for the cross validation folds, e.g., created with the function splitTools::create_folds().
return_models: A logical. Whether the fitted models should be returned with the results (default: FALSE).
performance_metric: Either a named list with metric functions, a single metric function, or a character vector with metric names from the mlr3measures package. The provided functions must take two named arguments: ground_truth and predictions. For metrics from the mlr3measures package, the wrapper function metric() exists in order to prepare them for use with the mlexperiments package.
performance_metric_args: A list. Further arguments required to compute the performance metric.
predict_args: A list. Further arguments required to compute the predictions.
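
As a sketch of the function-based options for performance_metric, the block below defines two custom metrics in base R with the required argument names ground_truth and predictions. The function names accuracy and error_rate are illustrative and not part of the package; the commented assignments assume an existing MLCrossValidation object named cv.

```r
# hypothetical custom metric: share of correct predictions
accuracy <- function(ground_truth, predictions) {
  mean(ground_truth == predictions)
}

# hypothetical custom metric: share of wrong predictions
error_rate <- function(ground_truth, predictions) {
  1 - accuracy(ground_truth = ground_truth, predictions = predictions)
}

# toy check on four observations, three of which are predicted correctly
truth <- c(0, 1, 1, 0)
preds <- c(0, 1, 0, 0)
accuracy(ground_truth = truth, predictions = preds)    # 0.75
error_rate(ground_truth = truth, predictions = preds)  # 0.25

# a single function or a named list of functions could then be assigned:
# cv$performance_metric <- accuracy
# cv$performance_metric <- list(acc = accuracy, err = error_rate)
```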

Methods

Inherited methods

mlexperiments::MLExperimentsBase$set_data()

Method `new()`

Create a new MLCrossValidation object.

Usage

MLCrossValidation$new(
  learner,
  fold_list,
  seed,
  ncores = -1L,
  return_models = FALSE
)

Arguments

learner: An initialized learner object that inherits from class "MLLearnerBase".
fold_list: A named list of predefined row indices for the cross validation folds, e.g., created with the function splitTools::create_folds().
seed: An integer. Needs to be set for reproducibility purposes.
ncores: An integer to specify the number of cores used for parallelization (default: -1L).
return_models: A logical. Whether the fitted models should be returned with the results (default: FALSE).

Details

The MLCrossValidation class requires a named list of predefined row indices for the cross-validation folds, e.g., created with the function splitTools::create_folds(). The length of this list also defines the k of the k-fold cross-validation. To perform a repeated k-fold cross-validation, provide a list with all repeated fold definitions, e.g., by specifying the argument m_rep of splitTools::create_folds().

Examples

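library(mlexperiments)

# simulate 500 observations: six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

# define three stratified folds; the length of the list sets k
fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

# initialize the cross-validation object with a k-nearest-neighbors learner
cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)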

Method `execute()`

Execute the cross validation.

Usage

MLCrossValidation$execute()

Details

All results of the cross-validation are saved in the field $results of the MLCrossValidation class. After successful execution of the cross-validation, $results contains a list with the items:

- "fold": A list of folds containing the following items for each cross-validation fold:
  - "fold_ids": A vector with the utilized in-sample row indices.
  - "ground_truth": A vector with the ground truth.
  - "predictions": A vector with the predictions.
  - "learner.args": A list with the arguments provided to the learner.
  - "model": If return_models = TRUE, the fitted model.
- "summary": A data.table with the summarized results (same as the returned value of the execute method).
- "performance": A list with the value of the performance metric calculated for each of the cross-validation folds.

Returns

The function returns a data.table with the results of the cross-validation. More results are accessible from the field $results of the MLCrossValidation class.
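
A minimal sketch of inspecting the returned value and the $results field after a run, reusing the setup from this page's examples. It assumes that the packages mlexperiments and splitTools are installed, along with the packages that LearnerKnn and metric("bacc") depend on. Folds are accessed by position so that no fold names are assumed, and the object names summary_dt and first_fold are illustrative.

```r
library(mlexperiments)

# simulate 500 observations: six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)
cv$learner_args <- list(
  k = 20,
  l = 0,
  test = parse(text = "fold_test$x")
)
cv$predict_args <- list(type = "response")
cv$performance_metric <- metric("bacc")
cv$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

# the returned data.table holds the summarized results
summary_dt <- cv$execute()
print(summary_dt)

# top-level items of the results list
names(cv$results)

# items stored for the first fold
first_fold <- cv$results$fold[[1]]
names(first_fold)
head(first_fold$predictions)

# performance metric values per fold
cv$results$performance
```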

Examples

library(mlexperiments)

# simulate 500 observations: six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) rnorm(n = 500),
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

# define three stratified folds
fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

# initialize the cross-validation object
cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)

# learner parameters
cv$learner_args <- list(
  k = 20,
  l = 0,
  test = parse(text = "fold_test$x")
)

# prediction and performance parameters
cv$predict_args <- list(type = "response")
cv$performance_metric <- metric("bacc")

# set data
cv$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

# run the cross-validation
cv$execute()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

MLCrossValidation$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
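
The difference between a shallow and a deep clone follows general R6 semantics, sketched below with two hypothetical classes, Inner and Outer, which are not part of mlexperiments. The sketch assumes the R6 package is installed. Whether the distinction matters for a given MLCrossValidation object depends on which of its fields are themselves R6 objects, which this page does not state.

```r
library(R6)

# hypothetical class holding a single value
Inner <- R6Class("Inner", public = list(value = 1))

# hypothetical class with a field that is itself an R6 object
Outer <- R6Class("Outer", public = list(
  inner = NULL,
  initialize = function() {
    self$inner <- Inner$new()
  }
))

original <- Outer$new()
shallow <- original$clone()            # deep = FALSE is the default
deep <- original$clone(deep = TRUE)

# modify the nested object of the original
original$inner$value <- 2

shallow$inner$value  # 2: the shallow clone shares the nested object
deep$inner$value     # 1: the deep clone has its own copy
```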

Examples

dataset <- do.call(
  cbind,
  c(sapply(paste0("col", 1:6), function(x) {
    rnorm(n = 500)
    },
    USE.NAMES = TRUE,
    simplify = FALSE
   ),
   list(target = sample(0:1, 500, TRUE))
))

fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)

# learner parameters
cv$learner_args <- list(
  k = 20,
  l = 0,
  test = parse(text = "fold_test$x")
)

# performance parameters
cv$predict_args <- list(type = "response")
cv$performance_metric <- metric("bacc")

# set data
cv$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

cv$execute()



[Package mlexperiments version 0.0.4 Index]
