tunePareto {TunePareto}    R Documentation
Generic function for multi-objective parameter tuning of classifiers
Description
This generic function tunes parameters of arbitrary classifiers in a multi-objective setting and returns the Pareto-optimal parameter combinations.
Usage
tunePareto(..., data, labels,
           classifier, parameterCombinations,
           sampleType = c("full", "uniform",
                          "latin", "halton",
                          "niederreiter", "sobol",
                          "evolution"),
           numCombinations,
           mu = 10, lambda = 20, numIterations = 100,
           objectiveFunctions, objectiveBoundaries,
           keepSeed = TRUE, useSnowfall = FALSE, verbose = TRUE)
Arguments
data
The data set to be used for the parameter tuning. This is usually a matrix or data frame with the samples in the rows and the features in the columns.
labels
A vector of class labels for the samples in data.
classifier
A TuneParetoClassifier object describing the classifier whose parameters are tuned. Wrappers for several standard classifiers, such as tunePareto.knn() and tunePareto.svm(), are available (see predefinedClassifiers).
parameterCombinations
If not all combinations of parameter ranges for the classifier are meaningful, you can set this parameter instead of specifying parameter values in the ... argument. It holds an explicit list of possible combinations, where each element of the list is a named sublist with one entry for each parameter (a structural sketch follows this argument list).
sampleType
Determines the way parameter configurations are sampled.
If "full", all possible combinations of the supplied parameter values are tried.
If "uniform", numCombinations combinations are drawn uniformly at random from the possible combinations.
If "latin", numCombinations combinations are chosen by Latin Hypercube sampling.
If "halton", "niederreiter" or "sobol", numCombinations combinations are drawn on the basis of the corresponding quasi-random (low-discrepancy) sequences.
If "evolution", parameter combinations are optimized using Evolution Strategies (see mu, lambda and numIterations).
numCombinations
If this parameter is set, at most numCombinations parameter configurations are drawn (according to sampleType) and tested; otherwise, all possible combinations of the supplied parameter values are tested.
mu
The number of individuals used in the Evolution Strategies if sampleType="evolution".
lambda
The number of offspring per generation in the Evolution Strategies if sampleType="evolution".
numIterations
The number of iterations/generations the evolutionary algorithm is run if sampleType="evolution".
objectiveFunctions
A list of objective functions used to tune the parameters. There are a number of predefined objective functions (see predefinedObjectiveFunctions); custom objective functions can be created using createObjective.
objectiveBoundaries
If this parameter is set, it specifies boundaries of the objective functions for valid solutions. That is, each element of the supplied vector specifies the upper or lower limit of an objective (depending on whether the objective is maximized or minimized). Parameter combinations that do not meet all these restrictions are not included in the result set, even if they are Pareto-optimal. If only some of the objectives should have bounds, supply NA for the remaining objectives.
keepSeed
If this is true, the random seed is reset to the same value for each of the tested parameter configurations. This is an easy way to guarantee comparability in randomized objective functions. E.g., cross-validation runs of the classifiers will all start with the same seed, which results in the same partitions. Attention: If you set this parameter to FALSE, objective values of configurations whose objective functions involve randomization (such as cross-validation) are based on different random draws and may therefore not be directly comparable.
useSnowfall
If this parameter is true, the routine loads the snowfall package and processes the parameter configurations in parallel. Please note that the snowfall cluster has to be initialized properly before running the tuning function and stopped after the run.
verbose
If this parameter is true, status messages are printed. In particular, the algorithm prints the currently tested combination.
...
The parameters of the classifier and predictor functions that should be tuned. The names of the parameters must correspond to the parameters expected by the classifier and predictor functions of the supplied classifier object. Discrete parameter values can be supplied as vectors of possible values; continuous parameter ranges can be specified as intervals using as.interval (see the examples below).
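As a structural sketch of the parameterCombinations argument (the parameter names k and l and their values are purely illustrative), each list element is itself a named list holding one complete configuration:

    comb <- list(list(k = 1, l = 0),
                 list(k = 3, l = 1),
                 list(k = 5, l = 2))

The same structure is produced by concatenating calls to allCombinations, as in the examples below.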
Details
This is a generic function that allows for parameter tuning of a wide variety of classifiers. You can either specify the values or intervals of the tuned parameters in the ... argument, or supply selected combinations of parameter values using parameterCombinations.

In the first case, combinations of the parameter values specified in the ... argument are generated. If sampleType="uniform", sampleType="latin", sampleType="halton", sampleType="niederreiter" or sampleType="sobol", a random subset of the possible combinations is drawn. If sampleType="evolution", random parameter combinations are generated and optimized using Evolution Strategies.
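For instance, a random subset of a discrete parameter grid can be requested as in the following sketch (modeled on the iris-based examples below; the chosen values are purely illustrative):

    res <- tunePareto(data = iris[, -ncol(iris)],
                      labels = iris[, ncol(iris)],
                      classifier = tunePareto.knn(),
                      k = c(1,3,5,7,9),
                      sampleType = "uniform",
                      numCombinations = 3,
                      objectiveFunctions = list(cvError(10, 10),
                                                reclassError()))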
In the latter case, only the parameter combinations specified explicitly in parameterCombinations are tested. This is useful if certain parameter combinations are invalid. You can create parameter combinations by concatenating results of calls to allCombinations. Only sampleType="full" is allowed in this mode.
For each of the combinations, the specified objective functions are calculated. This usually involves training and testing a classifier. From the resulting objective values, the non-dominated parameter configurations are calculated and returned.
The ... argument is the first argument of tunePareto for technical reasons (to prevent partial matching of the supplied parameters with argument names of tunePareto). This requires all arguments to be named.
Value
Returns a list of class TuneParetoResult
with the following components:
bestCombinations
A list of Pareto-optimal parameter configurations. Each element of the list consists of a sub-list with named elements corresponding to the parameter values.
bestObjectiveValues
A matrix containing the objective function values of the Pareto-optimal configurations in bestCombinations. Each row corresponds to one configuration, and each column corresponds to one objective function.
testedCombinations
A list of all tested parameter configurations with the same structure as bestCombinations.
testedObjectiveValues
A matrix containing the objective function values of all tested configurations with the same structure as bestObjectiveValues.
dominationMatrix
A Boolean matrix specifying which parameter configurations dominate each other. If the configuration in row i dominates the configuration in column j, the corresponding entry is TRUE.
minimizeObjectives
A Boolean vector specifying which of the objectives are minimization objectives. This is derived from the objective functions supplied in objectiveFunctions.
additionalData
A list containing additional data that may have been returned by the objective functions. The list has one element for each tested parameter configuration, each comprising one sub-element for each objective function that returned additional data. The structure of these sub-elements depends on the corresponding objective function. For example, the predefined objective functions (see predefinedObjectiveFunctions) can store the trained classifier models here if they are configured to do so.
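As a sketch of how these components might be accessed (the object name res is illustrative only; the call mirrors the first example below):

    res <- tunePareto(data = iris[, -ncol(iris)],
                      labels = iris[, ncol(iris)],
                      classifier = tunePareto.knn(),
                      k = c(1,3,5,7,9),
                      objectiveFunctions = list(cvError(10, 10),
                                                reclassError()))
    res$bestCombinations          # Pareto-optimal parameter configurations
    res$bestObjectiveValues       # their objective values, one row per configuration
    res$bestCombinations[[1]]$k   # value of 'k' in the first optimal configuration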
See Also
predefinedClassifiers, predefinedObjectiveFunctions, createObjective, allCombinations
Examples
# tune 'k' of a k-NN classifier
# on the 'iris' data set --
# see ?knn
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 k = c(1,3,5,7,9),
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError())))
# example using predefined parameter configurations,
# as certain combinations of k and l are invalid:
comb <- c(allCombinations(list(k=1, l=0)),
          allCombinations(list(k=3, l=0:2)),
          allCombinations(list(k=5, l=0:4)),
          allCombinations(list(k=7, l=0:6)))
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 parameterCombinations = comb,
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError())))
# tune 'cost' and 'kernel' of an SVM on
# the 'iris' data set using Latin Hypercube sampling --
# see ?svm and ?predict.svm
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.svm(),
                 cost = as.interval(0.001, 10),
                 kernel = c("linear", "polynomial",
                            "radial", "sigmoid"),
                 sampleType = "latin",
                 numCombinations = 20,
                 objectiveFunctions = list(cvError(10, 10),
                                           cvSensitivity(10, 10, caseClass="setosa"))))
# tune the same parameters using Evolution Strategies
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.svm(),
                 cost = as.interval(0.001, 10),
                 kernel = c("linear", "polynomial",
                            "radial", "sigmoid"),
                 sampleType = "evolution",
                 numCombinations = 20,
                 numIterations = 20,
                 objectiveFunctions = list(cvError(10, 10),
                                           cvSensitivity(10, 10, caseClass="setosa"),
                                           cvSpecificity(10, 10, caseClass="setosa"))))
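
# hedged sketch: restrict the result set using objective boundaries --
# here, only configurations with a cross-validation error of at most 0.1
# are kept; NA is assumed to leave the second objective unbounded
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 k = c(1,3,5,7,9),
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError()),
                 objectiveBoundaries = c(0.1, NA)))

## Not run: 
## hedged sketch of parallel evaluation with snowfall -- the cluster must
## be initialized before and stopped after the call
library(snowfall)
sfInit(parallel = TRUE, cpus = 2)
print(tunePareto(data = iris[, -ncol(iris)],
                 labels = iris[, ncol(iris)],
                 classifier = tunePareto.knn(),
                 k = c(1,3,5,7,9),
                 objectiveFunctions = list(cvError(10, 10),
                                           reclassError()),
                 useSnowfall = TRUE))
sfStop()
## End(Not run)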