Boost.validation {RRBoost} | R Documentation |
Robust Boosting for regression with initialization parameters chosen on a validation set
Description
A function to fit RRBoost (see also Boost) where the initialization parameters are chosen
based on the performance on the validation set.
Usage
Boost.validation(
  x_train,
  y_train,
  x_val,
  y_val,
  x_test,
  y_test,
  type = "RRBoost",
  error = c("rmse", "aad"),
  niter = 1000,
  max_depth = 1,
  y_init = "LADTree",
  max_depth_init_set = c(1, 2, 3, 4),
  min_leaf_size_init_set = c(10, 20, 30),
  control = Boost.control()
)
Arguments
x_train |
predictor matrix for training data (matrix/dataframe) |
y_train |
response vector for training data (vector/dataframe) |
x_val |
predictor matrix for validation data (matrix/dataframe) |
y_val |
response vector for validation data (vector/dataframe) |
x_test |
predictor matrix for test data (matrix/dataframe, optional, required when |
y_test |
response vector for test data (vector/dataframe, optional, required when |
type |
type of the boosting method: "L2Boost", "LADBoost", "MBoost", "Robloss", "SBoost", "RRBoost" (character string) |
error |
a character string (or vector of character strings) indicating the types of error metrics to be evaluated on the test set. Valid options are: "rmse" (root mean squared error), "aad" (average absolute deviation), and "trmse" (trimmed root mean squared error) |
niter |
number of iterations (for RRBoost T_1,max + T_2,max) (numeric) |
max_depth |
the maximum depth of the tree learners (numeric) |
y_init |
a string indicating the initial estimator to be used. Valid options are: "median" or "LADTree" (character string) |
max_depth_init_set |
a vector of candidate values for the maximum depth of the initial LADTree, from which the algorithm chooses |
min_leaf_size_init_set |
a vector of candidate values for the minimum number of observations per node of the initial LADTree, from which the algorithm chooses |
control |
a named list of control parameters, as returned by |
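The "rmse" and "aad" options of the error argument correspond to standard definitions, which can be sketched in base R as below. This is a minimal illustration, not the package's internal code: the helper names rmse and aad and the toy vectors are assumptions made for this example. "trmse" is left out because this page does not state the trimming rule.

```r
# Illustrative helpers; these are NOT functions exported by RRBoost.
# Root mean squared error of predictions y_hat against responses y.
rmse <- function(y, y_hat) sqrt(mean((y - y_hat)^2))
# Average absolute deviation of predictions y_hat against responses y.
aad <- function(y, y_hat) mean(abs(y - y_hat))

# Toy data (made up for illustration): three exact predictions and one
# prediction that is off by 4.
y     <- c(1, 2, 3, 4)
y_hat <- c(1, 2, 3, 8)

rmse_value <- rmse(y, y_hat)
aad_value  <- aad(y, y_hat)
print(c(rmse = rmse_value, aad = aad_value))
```

On this toy data the single large residual inflates the squared-error metric more than the absolute-error metric, which is why the choice of metric matters when the test set may contain outliers.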
Details
This function runs the RRBoost algorithm (see Boost) on different combinations of the
parameters for the initial fit, and chooses the optimal set based on the performance on the validation set.
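The selection described above amounts to a grid search over the initialization parameters. The following base R sketch shows the general pattern only; it does not reproduce the package's implementation. The names select_init_params and toy_val_error are hypothetical, and toy_val_error is a made-up stand-in for "fit on the training data, then compute the error on the validation data".

```r
# Hypothetical helper, not part of RRBoost: evaluate every combination of
# the two candidate sets and return the row with the smallest error.
select_init_params <- function(depth_set, leaf_set, val_error_fn) {
  grid <- expand.grid(max_depth_init = depth_set,
                      min_leaf_size_init = leaf_set)
  # val_error_fn is called once per row of the grid.
  grid$val_error <- mapply(val_error_fn,
                           grid$max_depth_init,
                           grid$min_leaf_size_init)
  grid[which.min(grid$val_error), ]
}

# Toy stand-in for a validation error (an assumption for illustration);
# a real version would fit a model and score it on the validation set.
toy_val_error <- function(max_depth_init, min_leaf_size_init) {
  (max_depth_init - 3)^2 + (min_leaf_size_init - 20)^2 / 100
}

best <- select_init_params(c(1, 2, 3, 4), c(10, 20, 30), toy_val_error)
print(best)
```

With the default candidate sets from Usage, the grid has 4 x 3 = 12 combinations, so the cost of the search grows with the product of the two set sizes.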
Value
A list with components
the components of model |
an object returned by Boost that is trained with selected initialization parameters |
param |
a vector of the selected initialization parameters (returns (0,0) if the selected initialization is the median of the training responses) |
Author(s)
Xiaomeng Ju, xmengju@stat.ubc.ca
See Also
Examples
data(airfoil)
n <- nrow(airfoil)
n0 <- floor( 0.2 * n )
set.seed(123)
idx_test <- sample(n, n0)
idx_train <- sample((1:n)[-idx_test], floor( 0.6 * n ) )
idx_val <- (1:n)[ -c(idx_test, idx_train) ]
xx <- airfoil[, -6]
yy <- airfoil$y
xtrain <- xx[ idx_train, ]
ytrain <- yy[ idx_train ]
xval <- xx[ idx_val, ]
yval <- yy[ idx_val ]
xtest <- xx[ idx_test, ]
ytest <- yy[ idx_test ]
model_RRBoost_cv_LADTree <- Boost.validation(
  x_train = xtrain, y_train = ytrain,
  x_val = xval, y_val = yval,
  x_test = xtest, y_test = ytest,
  type = "RRBoost", error = "rmse",
  y_init = "LADTree", max_depth = 1, niter = 1000,
  max_depth_init_set = 1:5,
  min_leaf_size_init_set = c(10, 20, 30),
  control = Boost.control(make_prediction = TRUE, cal_imp = TRUE)
)