
gradient_boosting_parameters {scorecardModelUtils}

R Documentation

Hyperparameter optimisation or parameter tuning for Gradient Boosting Regression Modelling by grid search

Description

The function runs a grid search with k-fold cross validation to find the best parameter combination according to a chosen performance measure. The parameters that can be tuned with this function for the gradient boosting regression modelling algorithm are ntree, depth, shrinkage, min_obs and bag_fraction. The objective function to be minimised is the error (mean absolute error, mean squared error or root mean squared error). For the grid search, the candidate values of each tuning parameter need to be passed to the function as a vector.
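As a rough illustration of what a grid search with k-fold cross validation involves, the base R sketch below enumerates every combination of candidate values and assigns rows to folds. It is not taken from the package source. The candidate values, `k`, `n` and all variable names are assumptions chosen for the example.

```r
# Candidate values per tuning parameter (illustrative, not recommendations)
grid <- expand.grid(
  ntree        = c(50, 100),
  depth        = c(1, 2),
  shrinkage    = c(0.01, 0.1),
  min_obs      = 10,
  bag_fraction = c(0.5, 0.7)
)
n_combinations <- nrow(grid)   # 2 * 2 * 2 * 1 * 2 = 16

# Randomly assign each of n rows to one of k folds of (near) equal size
set.seed(11)
k <- 3
n <- 150
fold_id <- sample(rep(seq_len(k), length.out = n))

# One model is fitted per (combination, fold) pair
n_fits <- n_combinations * k   # 48
```

The number of fits grows multiplicatively with the number of candidate values per parameter, so a small grid is usually a sensible starting point.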

Usage

gradient_boosting_parameters(base, target, ntree, depth, shrinkage, min_obs,
  bag_fraction, error = "rmse", cv = 1)

Arguments

`base`	input dataframe
`target`	column / field name for the target variable to be passed as string (must be 0/1 type)
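`ntree`	candidate value(s) for the number of trees to be fitted
`depth`	candidate value(s) for the maximum depth of variable interactions
`shrinkage`	candidate value(s) for the learning rate
`min_obs`	candidate value(s) for the minimum size of terminal nodes
`bag_fraction`	candidate value(s) for the fraction of the training set observations randomly selected for the next tree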
`error`	(optional) error measure as objective function to be minimised, to be chosen among "mae", "mse" and "rmse" (default value is "rmse")
`cv`	(optional) k value for the k-fold cross validation to be performed (default value is 1, i.e. without cross validation)
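The three error measures accepted by `error` have standard definitions, which can be sketched in base R as follows. The helper names `mae`, `mse` and `rmse` and the sample vectors are assumptions for illustration and are not functions exported by the package.

```r
# Standard definitions of the three error measures
mae  <- function(actual, predicted) mean(abs(actual - predicted))
mse  <- function(actual, predicted) mean((actual - predicted)^2)
rmse <- function(actual, predicted) sqrt(mse(actual, predicted))

# Toy 0/1 target and predicted scores (made-up values)
actual    <- c(0, 1, 1, 0)
predicted <- c(0.2, 0.7, 0.9, 0.4)

mae(actual, predicted)    # 0.25
mse(actual, predicted)    # 0.075
rmse(actual, predicted)   # square root of the mse above
```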

Value

An object of class "gradient_boosting_parameters" is a list containing the following components:

`error_tab_detailed`	dataframe of errors for each cross validation sample of every parameter combination iterated during the grid search
`error_tab_summary`	error summary for each combination of parameters as a dataframe
`best_ntree`	ntree parameter of the optimal solution
`best_depth`	depth parameter of the optimal solution
`best_shrinkage`	shrinkage parameter of the optimal solution
`best_min_obs`	min_obs parameter of the optimal solution
`best_bag_fraction`	bag_fraction parameter of the optimal solution
`runtime`	runtime of the entire process
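The relationship between the detailed table, the summary table and the `best_*` components can be pictured with the sketch below, which averages per-fold errors for each combination and picks the minimum. The column names and error values are made up for illustration, and the actual layout of the returned dataframes may differ.

```r
# Hypothetical per-fold errors for two parameter combinations (made-up numbers)
detailed <- data.frame(
  ntree = c(50, 50, 100, 100),
  depth = c(2, 2, 2, 2),
  fold  = c(1, 2, 1, 2),
  rmse  = c(0.52, 0.48, 0.45, 0.47)
)

# Average the error over folds for each parameter combination
summary_tab <- aggregate(rmse ~ ntree + depth, data = detailed, FUN = mean)

# The optimal solution is the combination with the lowest mean error
best <- summary_tab[which.min(summary_tab$rmse), ]
best$ntree   # 100
```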

Author(s)

Arya Poddar <aryapoddar290990@gmail.com>

Examples

data <- iris
suppressWarnings(RNGversion('3.5.0'))
set.seed(11)
data$Y <- sample(0:1,size=nrow(data),replace=TRUE)
gbm_params_list <- gradient_boosting_parameters(base = data,target = "Y",ntree = 2,depth = 2,
                   shrinkage = 0.1,min_obs = 0.1,bag_fraction = 0.7)
gbm_params_list$error_tab_detailed
gbm_params_list$error_tab_summary
gbm_params_list$best_ntree
gbm_params_list$best_depth
gbm_params_list$best_shrinkage
gbm_params_list$best_min_obs
gbm_params_list$best_bag_fraction
gbm_params_list$runtime

[Package scorecardModelUtils version 0.0.1.0 Index]