GridSearchCV {superml}                                    R Documentation

Grid Search CV

Description

Runs a grid search with cross-validation to find the best model training parameters.

Details

Grid search CV trains a machine learning model on multiple combinations of training hyperparameters and finds the combination that optimizes the evaluation metric. It creates an exhaustive set of hyperparameter combinations and trains the model on each combination.
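To make "exhaustive set of hyperparameter combinations" concrete, here is a minimal base R sketch that enumerates such a grid with expand.grid(). It only illustrates the idea and is not taken from superml's implementation; the parameter names and values are borrowed from the examples on this page.

```r
# Illustration only: enumerate every combination of the candidate values.
# This mirrors the idea of a grid search, not superml's internal code.
params <- list(n_estimators = c(100),
               max_depth = c(5, 2, 10))

# expand.grid() accepts a list and returns one row per combination
grid <- expand.grid(params, stringsAsFactors = FALSE)
print(grid)

# Number of combinations is the product of the candidate counts
n_combinations <- nrow(grid)

# With k-fold cross-validation, each combination is fitted once per fold
n_folds <- 3
n_fits <- n_combinations * n_folds
```

Because the grid grows multiplicatively with each added parameter, the number of model fits can become large quickly.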

Public fields

trainer

superml trainer object, for example XGBTrainer, RFTrainer or NBTrainer

parameters

a list of parameters to tune

n_folds

number of folds used to split the training data

scoring

scoring metric used to evaluate the best model; multiple values can be provided. Currently supports: auc, accuracy, mse, rmse, logloss, mae, f1, precision, recall

evaluation_scores

parameter for internal use
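The role of n_folds can be sketched in base R as follows. This is a generic illustration of k-fold splitting, under the assumption of a simple random assignment of rows to folds; superml may shuffle or stratify differently internally. The variable names fold_id, train_rows and valid_rows are hypothetical.

```r
# Illustration only: generic k-fold assignment, not superml's internal code.
set.seed(42)
n_folds <- 3
data("iris")

# Assign each row to one of n_folds folds, in random order
fold_id <- sample(rep(seq_len(n_folds), length.out = nrow(iris)))

for (k in seq_len(n_folds)) {
  valid_rows <- which(fold_id == k)   # held-out fold used for scoring
  train_rows <- which(fold_id != k)   # remaining folds used for training
  cat("fold", k, ":", length(train_rows), "train rows,",
      length(valid_rows), "validation rows\n")
}
```

Each row is used for validation exactly once, and the scores from the folds are typically averaged per parameter combination.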

Methods

Public methods


Method new()

Usage
GridSearchCV$new(trainer = NA, parameters = NA, n_folds = NA, scoring = NA)
Arguments
trainer

superml trainer object, for example XGBTrainer, RFTrainer or NBTrainer

parameters

list, a list of parameters to tune

n_folds

integer, number of folds used to split the training data

scoring

character, scoring metric used to evaluate the best model; multiple values can be provided. Currently supports: auc, accuracy, mse, rmse, logloss, mae, f1, precision, recall

Details

Create a new 'GridSearchCV' object.

Returns

A 'GridSearchCV' object.

Examples
rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))

Method fit()

Usage
GridSearchCV$fit(X, y)
Arguments
X

data.frame or data.table

y

character, name of the target variable in X

Details

Trains the model on each parameter combination using grid search.

Returns

NULL

Examples
rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))
data("iris")
gst$fit(iris, "Species")

Method best_iteration()

Usage
GridSearchCV$best_iteration(metric = NULL)
Arguments
metric

character, which metric to use for evaluation

Details

Returns the best parameter combination for the given metric.

Returns

a list of best parameters

Examples
rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))
data("iris")
gst$fit(iris, "Species")
gst$best_iteration()
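As a rough picture of what "best parameters for a metric" means, the base R sketch below picks the best row from a table of per-combination scores. Everything here is an assumption made for illustration: the score table layout, the placeholder score values (not results from a real run), and the helper pick_best() are hypothetical and do not describe how superml stores or selects its results.

```r
# Illustration only. The scores below are placeholders, not real results,
# and this table layout is an assumption, not superml's internal format.
scores <- data.frame(
  n_estimators = c(100, 100, 100),
  max_depth    = c(5, 2, 10),
  accuracy     = c(0.90, 0.85, 0.92),
  rmse         = c(0.30, 0.40, 0.35)
)

# Hypothetical helper: return the row with the best value of `metric`.
# Error metrics such as rmse are better when lower, so the direction
# has to be chosen per metric.
pick_best <- function(scores, metric, lower_is_better = FALSE) {
  idx <- if (lower_is_better) {
    which.min(scores[[metric]])
  } else {
    which.max(scores[[metric]])
  }
  as.list(scores[idx, , drop = FALSE])
}

best_acc  <- pick_best(scores, "accuracy")
best_rmse <- pick_best(scores, "rmse", lower_is_better = TRUE)
```

When several scoring metrics are supplied, the best combination can differ between metrics, which is why best_iteration() takes a metric argument.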

Method clone()

The objects of this class are cloneable with this method.

Usage
GridSearchCV$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples


## ------------------------------------------------
## Method `GridSearchCV$new`
## ------------------------------------------------

rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))

## ------------------------------------------------
## Method `GridSearchCV$fit`
## ------------------------------------------------

rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))
data("iris")
gst$fit(iris, "Species")

## ------------------------------------------------
## Method `GridSearchCV$best_iteration`
## ------------------------------------------------

rf <- RFTrainer$new()
gst <- GridSearchCV$new(trainer = rf,
                        parameters = list(n_estimators = c(100),
                                          max_depth = c(5, 2, 10)),
                        n_folds = 3,
                        scoring = c('accuracy', 'auc'))
data("iris")
gst$fit(iris, "Species")
gst$best_iteration()

[Package superml version 0.5.7 Index]