Boost.validation {RRBoost}    R Documentation

Robust Boosting for regression with initialization parameters chosen on a validation set

Description

Fits RRBoost (see also Boost), choosing the initialization parameters by their performance on the validation set.

Usage

Boost.validation(
  x_train,
  y_train,
  x_val,
  y_val,
  x_test,
  y_test,
  type = "RRBoost",
  error = c("rmse", "aad"),
  niter = 1000,
  max_depth = 1,
  y_init = "LADTree",
  max_depth_init_set = c(1, 2, 3, 4),
  min_leaf_size_init_set = c(10, 20, 30),
  control = Boost.control()
)

Arguments

x_train

predictor matrix for training data (matrix/dataframe)

y_train

response vector for training data (vector/dataframe)

x_val

predictor matrix for validation data (matrix/dataframe)

y_val

response vector for validation data (vector/dataframe)

x_test

predictor matrix for test data (matrix/dataframe; optional, but required when make_prediction in control is TRUE)

y_test

response vector for test data (vector/dataframe; optional, but required when make_prediction in control is TRUE)

type

type of the boosting method: "L2Boost", "LADBoost", "MBoost", "Robloss", "SBoost", "RRBoost" (character string)

error

a character string (or vector of character strings) indicating the types of error metrics to be evaluated on the test set. Valid options are: "rmse" (root mean squared error), "aad" (average absolute deviation), and "trmse" (trimmed root mean squared error)

niter

number of iterations (for RRBoost, T_{1,max} + T_{2,max}) (numeric)

max_depth

the maximum depth of the tree learners (numeric)

y_init

a string indicating the initial estimator to be used. Valid options are: "median" or "LADTree" (character string)

max_depth_init_set

a vector of possible values of the maximum depth of the initial LADTree that the algorithm chooses from

min_leaf_size_init_set

a vector of possible values of the minimum number of observations per node of the initial LADTree that the algorithm chooses from

control

a named list of control parameters, as returned by Boost.control
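
The two untrimmed metrics named under the error argument can be written out in a few lines. This is a minimal sketch that reads "root mean squared error" and "average absolute deviation" literally; the helper names rmse and aad and the toy data are assumptions for illustration, not the package's internal code. The trimmed variant "trmse" is left out because its trimming rule is not described on this page.

```r
# Root mean squared error between observed and predicted responses.
rmse <- function(y, y_hat) sqrt(mean((y - y_hat)^2))

# Average absolute deviation between observed and predicted responses.
aad <- function(y, y_hat) mean(abs(y - y_hat))

# Toy data (made up): the last observed response is an outlier.
y     <- c(1, 2, 3, 4, 100)
y_hat <- c(1, 2, 3, 4, 5)

rmse(y, y_hat)  # dominated by the single large residual
aad(y, y_hat)   # grows only linearly with that residual
```

On data with outlying responses the two metrics can rank models differently, which is one reason to request more than one of them in error.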

Details

This function runs the RRBoost algorithm (see Boost) on different combinations of the parameters for the initial fit, and chooses the optimal set based on the performance on the validation set.
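
The selection step described above amounts to a grid search scored on the validation set. The following is a rough sketch of that idea only, not the package's implementation: select_init_params and toy_val_error are hypothetical names, and the toy error function is a placeholder standing in for "fit with this initialization, then measure the error on the validation data".

```r
# Hypothetical helper: pick the grid row with the smallest validation error.
# val_error_fun(max_depth_init, min_leaf_size_init) must return one number.
select_init_params <- function(max_depth_init_set, min_leaf_size_init_set,
                               val_error_fun) {
  # All combinations of the two initialization parameters.
  grid <- expand.grid(max_depth_init = max_depth_init_set,
                      min_leaf_size_init = min_leaf_size_init_set)
  # Validation error of each combination.
  grid$val_error <- mapply(val_error_fun, grid$max_depth_init,
                           grid$min_leaf_size_init)
  # Row with the smallest error (the first one in case of ties).
  best <- grid[which.min(grid$val_error), ]
  list(param = c(best$max_depth_init, best$min_leaf_size_init), grid = grid)
}

# Placeholder error surface, NOT a real model fit:
# it is smallest at depth 3 and leaf size 20.
toy_val_error <- function(max_depth_init, min_leaf_size_init) {
  (max_depth_init - 3)^2 + ((min_leaf_size_init - 20) / 10)^2
}

res <- select_init_params(c(1, 2, 3, 4), c(10, 20, 30), toy_val_error)
res$param       # selected (max_depth_init, min_leaf_size_init)
nrow(res$grid)  # number of combinations evaluated
```

With the default sets c(1, 2, 3, 4) and c(10, 20, 30) the grid has 12 combinations, so the cost of the search grows with the product of the two set lengths.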

Value

A list with components

the components of model

an object returned by Boost that is trained with selected initialization parameters

param

a vector of the selected initialization parameters (returns (0, 0) if the selected initialization is the median of the training responses)

Author(s)

Xiaomeng Ju, xmengju@stat.ubc.ca

See Also

Boost, Boost.control.

Examples


library(RRBoost)  # provides the airfoil data and Boost.validation
data(airfoil)
n <- nrow(airfoil)
n0 <- floor( 0.2 * n )
set.seed(123)
# Split the rows into test (20%), training (60%), and validation (the rest)
idx_test <- sample(n, n0)
idx_train <- sample((1:n)[-idx_test], floor( 0.6 * n ) )
idx_val <- (1:n)[ -c(idx_test, idx_train) ]
# Predictors (all columns except the 6th) and response y
xx <- airfoil[, -6]
yy <- airfoil$y
xtrain <- xx[ idx_train, ]
ytrain <- yy[ idx_train ]
xval <- xx[ idx_val, ]
yval <- yy[ idx_val ]
xtest <- xx[ idx_test, ]
ytest <- yy[ idx_test ]
# Fit RRBoost; the depth and leaf size of the initial LADTree are
# chosen from the given sets using the validation data
model_RRBoost_cv_LADTree <- Boost.validation(x_train = xtrain,
      y_train = ytrain, x_val = xval, y_val = yval,
      x_test = xtest, y_test = ytest, type = "RRBoost", error = "rmse",
      y_init = "LADTree", max_depth = 1, niter = 1000,
      max_depth_init_set = 1:5,
      min_leaf_size_init_set = c(10, 20, 30),
      control = Boost.control(make_prediction = TRUE,
      cal_imp = TRUE))



[Package RRBoost version 0.1 Index]