Tune {ModTools} | R Documentation |
Tune Classifiers
Description
Some classifiers benefit more than others from parameters adjusted to a particular dataset. However, it is often not clear at the outset how these parameters should be chosen. When several parameters have to be found in combination, a grid search is often the only remaining option. The present function uses a grid search approach for the decisive arguments (typically for a neural network, a random forest or a classification tree). It is not restricted to these models, however; any model fulfilling weak interface standards can be provided.
Usage
Tune(x, ..., testset = NULL, keepmod = TRUE)
Arguments
x |
the model to be tuned, best (but not necessarily) trained with |
... |
a list of parameters, containing the values to be used for a grid search. |
testset |
a test set containing all variables required by the model, used to calculate the accuracy independently of the training data (normally a subset of the original dataset). |
keepmod |
logical, defining whether all fitted models should be returned in the result set. Default is TRUE. |
Details
The function creates an n-dimensional grid from the given parameters and fits the model for every combination of the parameter values. The accuracy of each model is calculated in-sample and, if a test set has been provided, on the test set as well.
To avoid overfitting, it makes sense to provide a test set for evaluation. The function returns a matrix with all combinations of the values for the given parameters, sorted by accuracy, either by the accuracy achieved on the test set or by the in-sample accuracy.
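The grid search described here can be illustrated with a minimal base R sketch. It is not the ModTools implementation: the simulated data, the tuned parameters (`degree`, `cutoff`) and the helper `acc_at()` are assumptions made up for illustration, with a logistic regression standing in for the classifier. Only the result columns `acc` and `test_acc` follow the naming used on this page.

```r
# Illustrative grid search in base R (NOT the ModTools code).
set.seed(1)

# simulated two-class data (hypothetical)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- factor(ifelse(d$x1 + d$x2^2 + rnorm(n, sd = 0.5) > 1, "pos", "neg"))

# hold out a quarter of the rows as a test set
idx   <- sample(n, 150)
train <- d[idx, ]
test  <- d[-idx, ]

# one row per combination of the candidate values (3 x 3 = 9 models)
grid <- expand.grid(degree = 1:3, cutoff = c(0.3, 0.5, 0.7))

# hypothetical helper: share of correct predictions at a given cutoff
acc_at <- function(fit, newdata, cutoff) {
  p <- predict(fit, newdata = newdata, type = "response")
  mean(ifelse(p > cutoff, "pos", "neg") == newdata$y)
}

# fit one model per grid row, record in-sample and test accuracy
res <- t(sapply(seq_len(nrow(grid)), function(i) {
  deg <- grid$degree[i]
  fit <- suppressWarnings(
    glm(y ~ poly(x1, deg) + poly(x2, deg), data = train, family = binomial)
  )
  c(acc      = acc_at(fit, train, grid$cutoff[i]),
    test_acc = acc_at(fit, test,  grid$cutoff[i]))
}))

# combine parameters with accuracies and sort by test accuracy
res <- cbind(grid, res)
res <- res[order(res$test_acc, decreasing = TRUE), ]
print(res)
```

Sorting by `test_acc` rather than `acc` keeps the ranking from favouring combinations that merely fit the training data well.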
Value
a matrix with all supplied parameters and a column "acc" and "test_acc" (if a test set has been provided)
Author(s)
Andri Signorell <andri@signorell.net>
Examples
library(ModTools)
d.pim <- SplitTrainTest(d.pima, p = 0.2)
mdiab <- formula(diabetes ~ pregnant + glucose + pressure + triceps
                 + insulin + mass + pedigree + age)
# tune a neural network for size and decay
r.nn <- FitMod(mdiab, data=d.pim$train, fitfn="nnet")
(tu <- Tune(r.nn, size=12:17, decay = 10^(-4:-1), testset=d.pim$test))
# tune a random forest
r.rf <- FitMod(mdiab, data=d.pim$train, fitfn="randomForest")
(tu <- Tune(r.rf, mtry=seq(2, 20, 2), testset=d.pim$test))
# tune an SVM model
r.svm <- FitMod(mdiab, data=d.pim$train, fitfn="svm")
tu <- Tune(r.svm,
           kernel = c("radial", "sigmoid"),
           cost = c(0.1, 1, 10, 100, 1000),
           gamma = c(0.5, 1, 2, 3, 4), testset = d.pim$test)
# let's get some more quality measures
tu$modpar$Sens <- sapply(tu$mods, Sens) # Sensitivity
tu$modpar$Spec <- sapply(tu$mods, Spec) # Specificity
Sort(tu$modpar, ord="test_acc", decreasing=TRUE)