eztune {EZtune}    R Documentation

Supervised Learning Function

Description

eztune is a function that automatically tunes adaboost, support vector machines, gradient boosting machines, and elastic net models. An optimization algorithm is used to find a good set of tuning parameters for the selected model. The function optimizes using a validation dataset, cross-validated accuracy, or resubstitution accuracy.

Usage

eztune(
  x,
  y,
  method = "svm",
  optimizer = "hjn",
  fast = TRUE,
  cross = NULL,
  loss = "default"
)

Arguments

x

Matrix or data frame containing the independent (predictor) variables.

y

Vector of responses. Can either be a factor or a numeric vector.

method

Model to be fit. Choices are "ada" for adaboost, "en" for elastic net, "gbm" for gradient boosting machines, and "svm" for support vector machines.

optimizer

Optimization method. Options are "ga" for a genetic algorithm and "hjn" for a Hooke-Jeeves optimizer.

fast

Indicates if the function should use a subset of the observations when optimizing to speed up computation. A value of TRUE uses the smaller of 50% of the data or 200 observations for model fitting; a number between 0 and 1 specifies the proportion of the data to be used to fit the model; and a positive integer specifies the number of observations to be used to fit the model. The model is computed on a random subset of the data and the remaining observations are used to validate model performance. The validation error measure is used as the optimization criterion.

cross

If an integer k > 1 is specified, k-fold cross-validation is used to fit the model. This method can be very slow for large datasets. This argument is ignored unless fast = FALSE.

loss

The type of loss function used for optimization. Options for models with a binary response are "class" for classification error and "auc" for area under the ROC curve. Options for models with a continuous response are "mse" for mean squared error and "mae" for mean absolute error. If "default" is selected, or no loss is specified, classification accuracy is used for a binary response and MSE is used for a continuous response.
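As a sketch of the loss options for a binary response, using the Sonar data that also appears in the Examples (the EZtune and mlbench packages are assumed to be installed):

```r
library(EZtune)
library(mlbench)
data(Sonar)
x <- Sonar[, 1:10]
y <- Sonar[, 61]  # binary factor response ("M" or "R")

# Optimize on classification error instead of the default accuracy
eztune(x, y, loss = "class")

# Optimize on area under the ROC curve
eztune(x, y, loss = "auc")
```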

Value

Function returns an object of class "eztune" which contains a summary of the tuning parameters for the best model, the best loss measure achieved (classification accuracy, AUC, MSE, or MAE), and the best model.
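The components listed below can be accessed directly from the returned object. A minimal sketch, assuming an SVM fit to a binary response as in the Examples:

```r
library(EZtune)
library(mlbench)
data(Sonar)
mod <- eztune(Sonar[, 1:10], Sonar[, 61])

mod$loss    # best loss measure achieved (classification accuracy here)
mod$cost    # tuned SVM cost parameter
mod$gamma   # tuned SVM gamma parameter
mod$model   # the best fitted model, an svm object from e1071
```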

loss

Best loss measure obtained by the optimizer. This is the measure specified by the user that the optimizer uses to choose a "best" model (classification accuracy, AUC, MSE, or MAE). Note that if the default option is used it is the classification accuracy for a binary response and the MSE for a continuous response.

model

Best model found by the optimizer. The adaboost model comes from package ada (an ada object), the elastic net model from package glmnet (a glmnet object), the gbm model from package gbm (a gbm.object), and the svm model from package e1071 (an svm object).

n

Number of observations used in model training when the fast option is used.

nfold

Number of folds used if cross-validation is used for optimization.

iter

Tuning parameter for adaboost.

nu

Tuning parameter for adaboost.

shrinkage

Tuning parameter for adaboost and gbm.

lambda

Tuning parameter for elastic net.

alpha

Tuning parameter for elastic net.

n.trees

Tuning parameter for gbm.

interaction.depth

Tuning parameter for gbm.

n.minobsinnode

Tuning parameter for gbm.

cost

Tuning parameter for svm.

gamma

Tuning parameter for svm.

epsilon

Tuning parameter for svm regression.

levels

If the model has a binary response, the levels of y are listed.

Examples

library(mlbench)
data(Sonar)
sonar <- Sonar[sample(1:nrow(Sonar), 100), ]

y <- sonar[, 61]
x <- sonar[, 1:10]

# Optimize an SVM using the default fast setting and Hooke-Jeeves
eztune(x, y)

# Optimize an SVM with 3-fold cross validation and Hooke-Jeeves
eztune(x, y, fast = FALSE, cross = 3)

# Optimize GBM using training set of 50 observations and Hooke-Jeeves
eztune(x, y, method = "gbm", fast = 50, loss = "auc")

# Optimize SVM with 25% of the observations as a training dataset
# using a genetic algorithm
eztune(x, y, method = "svm", optimizer = "ga", fast = 0.25)
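For a continuous response, the same call is used with a numeric y. A sketch using the BostonHousing data from mlbench (this dataset choice is an illustration, not part of the package examples; the factor column chas is dropped so that x is numeric):

```r
data(BostonHousing)
xr <- BostonHousing[, c(1:3, 5:13)]  # numeric predictors only
yr <- BostonHousing$medv             # continuous response

# Optimize an SVM regression on mean absolute error
eztune(xr, yr, loss = "mae")
```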


[Package EZtune version 3.1.1 Index]