| MLCrossValidation {mlexperiments} | R Documentation |
R6 Class to perform cross-validation experiments
Description
The MLCrossValidation class is used to construct a cross validation object
and to perform a k-fold cross validation for a specified machine learning
algorithm using one distinct hyperparameter setting.
Details
The MLCrossValidation class requires a named list of predefined row
indices for the cross-validation folds, e.g., created with the function
splitTools::create_folds(). This list also defines the k of the k-fold
cross-validation. To perform a repeated k-fold cross-validation, provide
a list containing all repeated fold definitions, e.g., by specifying the
argument m_rep of splitTools::create_folds().
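To make the expected shape of the fold list concrete, the following base R sketch builds one by hand. The helper make_fold_list() and the fold naming scheme are hypothetical and only illustrate the structure (a named list of in-sample row indices, with k times the number of repetitions as its length). Unlike the stratified splitTools::create_folds() call in the examples, this sketch performs no stratification.

```r
# Hypothetical helper (not part of mlexperiments or splitTools):
# builds a named list of in-sample (training) row indices.
make_fold_list <- function(n, k, m_rep = 1L, seed = 123L) {
  set.seed(seed)
  folds <- list()
  for (r in seq_len(m_rep)) {
    # Randomly assign every row to exactly one of the k hold-out groups.
    assignment <- sample(rep_len(seq_len(k), n))
    for (f in seq_len(k)) {
      # Illustrative naming scheme; the names themselves are an assumption.
      fold_name <- if (m_rep > 1L) {
        sprintf("Fold%d.Rep%d", f, r)
      } else {
        sprintf("Fold%d", f)
      }
      # Keep the rows that are NOT held out in fold f.
      folds[[fold_name]] <- which(assignment != f)
    }
  }
  folds
}

# Two repetitions of a 3-fold split over 500 rows give 6 fold definitions.
fold_list <- make_fold_list(n = 500L, k = 3L, m_rep = 2L)
length(fold_list)
names(fold_list)
```

A list of this shape, whether built by hand or with splitTools::create_folds(), is what the fold_list argument expects.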
Super classes
mlexperiments::MLBase -> mlexperiments::MLExperimentsBase -> MLCrossValidation
Public fields
fold_list
    A named list of predefined row indices for the cross validation folds, e.g., created with the function splitTools::create_folds().
return_models
    A logical. If the fitted models should be returned with the results (default: FALSE).
performance_metric
    Either a named list with metric functions, a single metric function, or a character vector with metric names from the mlr3measures package. The provided functions must take two named arguments: ground_truth and predictions. For metrics from the mlr3measures package, the wrapper function metric() exists in order to prepare them for use with the mlexperiments package.
performance_metric_args
    A list. Further arguments required to compute the performance metric.
predict_args
    A list. Further arguments required to compute the predictions.
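As an illustration of the interface described for performance_metric, a custom metric only needs to accept the two named arguments ground_truth and predictions. The function accuracy_metric() below is a hypothetical example, not something shipped with the package.

```r
# Hypothetical custom metric: fraction of predictions equal to the ground truth.
accuracy_metric <- function(ground_truth, predictions) {
  stopifnot(length(ground_truth) == length(predictions))
  mean(ground_truth == predictions)
}

# Three of the four toy predictions match, so the result is 0.75.
acc <- accuracy_metric(
  ground_truth = c(0, 1, 1, 0),
  predictions = c(0, 1, 0, 0)
)
acc

# Assuming `cv` is an MLCrossValidation object, such a function could be
# assigned to the field directly:
# cv$performance_metric <- accuracy_metric
```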
Methods
Method new()
Create a new MLCrossValidation object.
Usage
MLCrossValidation$new( learner, fold_list, seed, ncores = -1L, return_models = FALSE )
Arguments
learner
    An initialized learner object that inherits from class "MLLearnerBase".
fold_list
    A named list of predefined row indices for the cross validation folds, e.g., created with the function splitTools::create_folds().
seed
    An integer. Needs to be set for reproducibility purposes.
ncores
    An integer to specify the number of cores used for parallelization (default: -1L).
return_models
    A logical. If the fitted models should be returned with the results (default: FALSE).
Details
The MLCrossValidation class requires a named list of predefined row
indices for the cross-validation folds, e.g., created with the function
splitTools::create_folds(). This list also defines the k of the k-fold
cross-validation. To perform a repeated k-fold cross-validation, provide
a list containing all repeated fold definitions, e.g., by specifying the
argument m_rep of splitTools::create_folds().
Examples
# simulate a dataset with six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) {
        rnorm(n = 500)
      },
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

# define three stratified folds on the target column
fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)
Method execute()
Execute the cross validation.
Usage
MLCrossValidation$execute()
Details
All results of the cross validation are saved in the field
$results of the MLCrossValidation class. After successful execution
of the cross validation, $results contains a list with the items:
"fold" A list of folds containing the following items for each cross validation fold:
    "fold_ids" A vector with the utilized in-sample row indices.
    "ground_truth" A vector with the ground truth.
    "predictions" A vector with the predictions.
    "learner.args" A list with the arguments provided to the learner.
    "model" If return_models = TRUE, the fitted model.
"summary" A data.table with the summarized results (same as the returned value of the execute method).
"performance" A list with the value of the performance metric calculated for each of the cross validation folds.
Returns
The function returns a data.table with the results of the cross
validation. More results are accessible from the field $results of
the MLCrossValidation class.
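The nested layout of $results described above can be explored with ordinary list indexing. The sketch below uses a hand-written mock list rather than a real cross-validation run: all values are invented, only part of the documented items are mocked, and the assumption that the entries of "fold" are named after the folds is made purely for illustration.

```r
# Mock of part of the documented `$results` layout (invented values).
mock_results <- list(
  fold = list(
    Fold1 = list(
      fold_ids = 5:8,
      ground_truth = c(0, 1, 1, 0),
      predictions = c(0, 1, 0, 0),
      learner.args = list(k = 20, l = 0)
    ),
    Fold2 = list(
      fold_ids = 1:4,
      ground_truth = c(1, 1, 0, 0),
      predictions = c(1, 1, 0, 0),
      learner.args = list(k = 20, l = 0)
    )
  )
)

# Recompute a simple per-fold accuracy from the stored vectors.
fold_accuracy <- vapply(
  mock_results$fold,
  function(fold) mean(fold$ground_truth == fold$predictions),
  FUN.VALUE = numeric(1)
)
fold_accuracy
```

With a real object, the same indexing pattern would start from cv$results instead of mock_results.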
Examples
# simulate a dataset with six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) {
        rnorm(n = 500)
      },
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

# define three stratified folds on the target column
fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)

# learner parameters
cv$learner_args <- list(
  k = 20,
  l = 0,
  test = parse(text = "fold_test$x")
)

# performance parameters
cv$predict_args <- list(type = "response")
cv$performance_metric <- metric("bacc")

# set data
cv$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

cv$execute()
Method clone()
The objects of this class are cloneable with this method.
Usage
MLCrossValidation$clone(deep = FALSE)
Arguments
deep
    Whether to make a deep clone.
See Also
splitTools::create_folds(), mlr3measures::measures,
metric()
Examples
# simulate a dataset with six numeric features and a binary target
dataset <- do.call(
  cbind,
  c(
    sapply(
      paste0("col", 1:6),
      function(x) {
        rnorm(n = 500)
      },
      USE.NAMES = TRUE,
      simplify = FALSE
    ),
    list(target = sample(0:1, 500, TRUE))
  )
)

# define three stratified folds on the target column
fold_list <- splitTools::create_folds(
  y = dataset[, 7],
  k = 3,
  type = "stratified",
  seed = 123
)

cv <- MLCrossValidation$new(
  learner = LearnerKnn$new(),
  fold_list = fold_list,
  seed = 123,
  ncores = 2
)

# learner parameters
cv$learner_args <- list(
  k = 20,
  l = 0,
  test = parse(text = "fold_test$x")
)

# performance parameters
cv$predict_args <- list(type = "response")
cv$performance_metric <- metric("bacc")

# set data
cv$set_data(
  x = data.matrix(dataset[, -7]),
  y = dataset[, 7]
)

cv$execute()