benchmark {mlr3} | R Documentation |
Benchmark Multiple Learners on Multiple Tasks
Description
Runs a benchmark on arbitrary combinations of tasks (Task), learners (Learner), and resampling strategies (Resampling), possibly in parallel.
Usage
benchmark(
design,
store_models = FALSE,
store_backends = TRUE,
encapsulate = NA_character_,
allow_hotstart = FALSE,
clone = c("task", "learner", "resampling"),
unmarshal = TRUE
)
Arguments
design |
( |
store_models |
( |
store_backends |
( |
encapsulate |
( |
allow_hotstart |
( |
clone |
( |
unmarshal |
|
Value
Predict Sets
If you want to compare the performance of a learner on the training with the performance
on the test set, you have to configure the Learner to predict on multiple sets by
setting the field predict_sets
to c("train", "test")
(default is "test"
).
Each set yields a separate Prediction object during resampling.
In the next step, you have to configure the measures to operate on the respective Prediction object:
m1 = msr("classif.ce", id = "ce.train", predict_sets = "train") m2 = msr("classif.ce", id = "ce.test", predict_sets = "test")
The (list of) created measures can finally be passed to $aggregate()
or $score()
.
Parallelization
This function can be parallelized with the future package.
One job is one resampling iteration, and all jobs are send to an apply function
from future.apply in a single batch.
To select a parallel backend, use future::plan()
.
Progress Bars
This function supports progress bars via the package progressr.
Simply wrap the function call in progressr::with_progress()
to enable them.
Alternatively, call progressr::handlers()
with global = TRUE
to enable progress bars
globally.
We recommend the progress package as backend which can be enabled with
progressr::handlers("progress")
.
Logging
The mlr3 uses the lgr package for logging.
lgr supports multiple log levels which can be queried with
getOption("lgr.log_levels")
.
To suppress output and reduce verbosity, you can lower the log from the
default level "info"
to "warn"
:
lgr::get_logger("mlr3")$set_threshold("warn")
To get additional log output for debugging, increase the log level to "debug"
or "trace"
:
lgr::get_logger("mlr3")$set_threshold("debug")
To log to a file or a data base, see the documentation of lgr::lgr-package.
Note
The fitted models are discarded after the predictions have been scored in order to reduce memory consumption.
If you need access to the models for later analysis, set store_models
to TRUE
.
See Also
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-benchmarking
Package mlr3viz for some generic visualizations.
-
mlr3benchmark for post-hoc analysis of benchmark results.
Other benchmark:
BenchmarkResult
,
benchmark_grid()
Examples
# benchmarking with benchmark_grid()
tasks = lapply(c("penguins", "sonar"), tsk)
learners = lapply(c("classif.featureless", "classif.rpart"), lrn)
resamplings = rsmp("cv", folds = 3)
design = benchmark_grid(tasks, learners, resamplings)
print(design)
set.seed(123)
bmr = benchmark(design)
## Data of all resamplings
head(as.data.table(bmr))
## Aggregated performance values
aggr = bmr$aggregate()
print(aggr)
## Extract predictions of first resampling result
rr = aggr$resample_result[[1]]
as.data.table(rr$prediction())
# Benchmarking with a custom design:
# - fit classif.featureless on penguins with a 3-fold CV
# - fit classif.rpart on sonar using a holdout
tasks = list(tsk("penguins"), tsk("sonar"))
learners = list(lrn("classif.featureless"), lrn("classif.rpart"))
resamplings = list(rsmp("cv", folds = 3), rsmp("holdout"))
design = data.table::data.table(
task = tasks,
learner = learners,
resampling = resamplings
)
## Instantiate resamplings
design$resampling = Map(
function(task, resampling) resampling$clone()$instantiate(task),
task = design$task, resampling = design$resampling
)
## Run benchmark
bmr = benchmark(design)
print(bmr)
## Get the training set of the 2nd iteration of the featureless learner on penguins
rr = bmr$aggregate()[learner_id == "classif.featureless"]$resample_result[[1]]
rr$resampling$train_set(2)