tune {llama}    R Documentation
Tune the hyperparameters of the machine learning algorithm underlying a model
Description
Functions to tune the hyperparameters of the machine learning algorithm underlying a model with respect to a performance measure.
Usage
tuneModel(ldf, llama.fun, learner, design, metric = parscores, nfolds = 10L,
quiet = FALSE)
Arguments
ldf: the LLAMA data to use, as returned by input.

llama.fun: the LLAMA model building function, e.g. classify.

learner: the mlr learner to use.

design: the data frame denoting the parameter values to try. Can be produced with e.g. generateRandomDesign from the ParamHelpers package.

metric: the metric used to evaluate the model. Can be one of parscores (the default), misclassificationPenalties, or successes.

nfolds: the number of folds. Defaults to 10. If -1 is given, leave-one-out cross-validation folds are produced.

quiet: whether to suppress output of intermediate values and progress information during tuning.
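For example, a model could be tuned with respect to misclassification penalties instead of the default PAR10 score; a minimal sketch, with data and the remaining arguments as in the Examples below:

# tune with respect to misclassification penalties rather than PAR10
res = tuneModel(satsolvers, classify, learner, design,
                metric = misclassificationPenalties)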
Details
tuneModel finds the hyperparameters from the set denoted by design of the machine learning algorithm learner that give the best performance with respect to the measure metric for the LLAMA model type llama.fun on the data ldf. It uses a nested cross-validation internally; the number of inner folds is given through nfolds, and the number of outer folds is either determined by any existing partitions of ldf or, if none are present, by nfolds as well.
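One way to fix the outer folds is to partition the data beforehand, for example with llama's cvFolds; a minimal sketch, assuming the satsolvers example data and the classify model builder (learner and design as in the Examples below):

library(llama)
data(satsolvers)
# pre-partition the data; tuneModel reuses these partitions as its outer folds
folds = cvFolds(satsolvers, nfolds = 5L)
# res = tuneModel(folds, classify, learner, design)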
During each iteration of the inner cross-validation, all parameter sets specified in design are evaluated and the one with the best performance value is chosen. The mean performance over all instances in the data is logged for all evaluations. The chosen parameter set is then used to build and evaluate a model in the outer cross-validation. The predictions made by this model, along with the parameter values used to train it, are returned.
Finally, a normal (non-nested) cross-validation is performed to find the best parameter values on the entire data set. The predictor of this model, along with the parameter values used to train it, is returned. In this respect the interface corresponds to the normal LLAMA model-building functions: the returned data structure is the same, with a few additional values.
The evaluation across the folds will be parallelized automatically if a
suitable backend for parallel computation is loaded. The parallelMap
level is "llama.tune".
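For instance, a socket backend could be registered as follows (a sketch; the backend type and number of CPUs are arbitrary choices):

library(parallelMap)
# parallelize only the evaluations at the tuning level
parallelStartSocket(2, level = "llama.tune")
# ... call tuneModel(...) here ...
parallelStop()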
Value
predictions: a data frame with the predictions for each instance and test set. The structure is the same as for the underlying model building function, and the predictions are the ones made by the models trained with the best parameter values for the respective fold.

predictor: a function that encapsulates the classifier learned on the entire data set with the best parameter values determined on the entire data set. Can be called with data for the same features, with the same feature names as the training data, to obtain predictions in the same format as the predictions data frame.

models: the list of models trained on the entire data set. This is meant for debugging/inspection purposes.

parvals: the best parameter values on the entire data set, used for training the predictor.

inner.parvals: the best parameter values during each iteration of the outer cross-validation. These parameters were used to train the models that made the predictions in predictions.
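As an illustration, the returned structure might be used as follows (a sketch; newdata stands for a hypothetical data frame with the same feature names as the training data):

res = tuneModel(satsolvers, classify, learner, design)
res$parvals                       # best parameter values on the entire data set
res$inner.parvals                 # best parameter values per outer fold
# preds = res$predictor(newdata)  # same format as res$predictions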
Author(s)
Bernd Bischl, Lars Kotthoff
Examples
if(Sys.getenv("RUN_EXPENSIVE") == "true") {
library(llama)      # provides tuneModel, classify and the satsolvers data
library(mlr)        # provides makeLearner
library(ParamHelpers)
data(satsolvers)
learner = makeLearner("classif.J48")
# parameter set for J48
ps = makeParamSet(makeIntegerParam("M", lower = 1, upper = 100))
# generate 10 random parameter sets
design = generateRandomDesign(10, ps)
# tune with respect to PAR10 score (default) with 10 outer and inner folds
# (default)
res = tuneModel(satsolvers, classify, learner, design)
}