Boost {RRBoost}    R Documentation
Robust Boosting for regression
Description
This function implements the RRBoost robust boosting algorithm for regression, as well as other robust and non-robust boosting algorithms for regression.
Usage
Boost(
x_train,
y_train,
x_val,
y_val,
x_test,
y_test,
type = "RRBoost",
error = c("rmse", "aad"),
niter = 200,
y_init = "LADTree",
max_depth = 1,
tree_init_provided = NULL,
control = Boost.control()
)
Arguments
x_train: predictor matrix for training data (matrix/dataframe)
y_train: response vector for training data (vector/dataframe)
x_val: predictor matrix for validation data (matrix/dataframe)
y_val: response vector for validation data (vector/dataframe)
x_test: predictor matrix for test data (matrix/dataframe; optional, required when make_prediction = TRUE in control)
y_test: response vector for test data (vector/dataframe; optional, required when make_prediction = TRUE in control)
type: type of the boosting method, one of "L2Boost", "LADBoost", "MBoost", "Robloss", "SBoost", "RRBoost" (character string)
error: a character string (or vector of character strings) indicating the type of error metrics to be evaluated on the test set. Valid options are: "rmse" (root mean squared error), "aad" (average absolute deviation), and "trmse" (trimmed root mean squared error)
niter: number of boosting iterations (for RRBoost: T_1,max + T_2,max) (numeric)
y_init: a string indicating the initial estimator to be used. Valid options are: "median" or "LADTree" (character string)
max_depth: the maximum depth of the tree learners (numeric)
tree_init_provided: an optional pre-fitted initial tree (an rpart object)
control: a named list of control parameters, as returned by Boost.control() (a short sketch follows this list)
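The snippet below is only a sketch (not the package's own example) of how a control list might be assembled; ctrl is an illustrative name, and the options shown are the ones referenced elsewhere on this page.

## Sketch only: assemble a control list from the options referenced on this page.
## max_depth_init and min_leaf_size_init are assumed to configure the "LADTree"
## initial estimator; save_f, make_prediction, and cal_imp determine which
## components listed in the Value section are returned.
ctrl <- Boost.control(max_depth_init = 2, min_leaf_size_init = 20,
                      make_prediction = TRUE, save_f = TRUE, cal_imp = FALSE)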
Details
This function implements a robust boosting algorithm for regression (RRBoost).
It also includes the following robust and non-robust boosting algorithms
for regression: L2Boost, LADBoost, MBoost, Robloss, and SBoost. This function
uses the functions available in the rpart
package to construct binary regression trees.
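All of the listed variants are fitted through the same interface, so changing type is enough to switch methods on the same data. The call below is a hedged sketch rather than part of the package's examples; it assumes data splits xtrain, ytrain, xval, yval, xtest, and ytest built as in the Examples section, and fit_l2 is an illustrative name.

## Sketch only: fit the non-robust L2Boost variant with the same interface,
## reusing the training/validation/test splits from the Examples section.
fit_l2 <- Boost(x_train = xtrain, y_train = ytrain,
                x_val = xval, y_val = yval,
                x_test = xtest, y_test = ytest,
                type = "L2Boost", error = c("rmse", "aad"),
                y_init = "median", max_depth = 1, niter = 10,
                control = Boost.control(make_prediction = TRUE, cal_imp = FALSE))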
Value
A list with the following components:
type: which boosting algorithm was run; one of "L2Boost", "LADBoost", "MBoost", "Robloss", "SBoost", "RRBoost" (character string)
control: the list of control parameters used
niter: number of iterations for the boosting algorithm (for RRBoost: T_1,max + T_2,max) (numeric)
error: if make_prediction = TRUE in control, the test errors evaluated at the early stopping iteration (one value per error type requested in the error argument)
tree_init: if y_init = "LADTree", the fitted initial tree (an rpart object)
tree_list: if save_tree = TRUE in control, the list of trees (base learners) fitted at each boosting iteration
f_train_init: a vector with the initial estimator evaluated on the training data
alpha: a vector of base learners' coefficients
early_stop_idx: the early stopping iteration
when_init: if type = "RRBoost", the iteration at which the initialization (first) stage of the algorithm stopped
loss_train: a vector of training loss values (one per iteration)
loss_val: a vector of validation loss values (one per iteration)
err_val: a vector of validation aad errors (one per iteration)
err_train: a vector of training aad errors (one per iteration)
err_test: a matrix of test errors before and at the early stopping iteration (returned if make_prediction = TRUE in control); the matrix dimension is the early stopping iteration by the number of error types (matching the error argument)
f_train: a matrix of training function estimates at all iterations (returned if save_f = TRUE in control); each column corresponds to the fitted values of the predictor at one iteration
f_val: a matrix of validation function estimates at all iterations (returned if save_f = TRUE in control); each column corresponds to the fitted values of the predictor at one iteration
f_test: a matrix of test function estimates before and at the early stopping iteration (returned if save_f = TRUE and make_prediction = TRUE in control); each column corresponds to the fitted values of the predictor at one iteration
var_select: a vector of variable selection indicators (one per explanatory variable; 1 if the variable was selected by at least one of the base learners, and 0 otherwise)
var_importance: a vector of permutation variable importance scores (one per explanatory variable, and returned if cal_imp = TRUE in control); a short inspection sketch follows this list
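As a hedged illustration (not taken from the package itself), the components above can be read off the returned list directly; fit stands for any object returned by Boost().

## Sketch only: "fit" stands for an object returned by Boost().
fit$early_stop_idx           # early stopping iteration
fit$alpha                    # coefficients of the base learners
fit$err_test                 # test errors (if make_prediction = TRUE in control)
which(fit$var_select == 1)   # variables used by at least one base learner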
Author(s)
Xiaomeng Ju, xmengju@stat.ubc.ca
See Also
Boost.validation, Boost.control.
Examples
data(airfoil)
n <- nrow(airfoil)
n0 <- floor( 0.2 * n )
set.seed(123)
idx_test <- sample(n, n0)
idx_train <- sample((1:n)[-idx_test], floor( 0.6 * n ) )
idx_val <- (1:n)[ -c(idx_test, idx_train) ]
xx <- airfoil[, -6]
yy <- airfoil$y
xtrain <- xx[ idx_train, ]
ytrain <- yy[ idx_train ]
xval <- xx[ idx_val, ]
yval <- yy[ idx_val ]
xtest <- xx[ idx_test, ]
ytest <- yy[ idx_test ]
model_RRBoost_LADTree <- Boost(x_train = xtrain, y_train = ytrain,
x_val = xval, y_val = yval, x_test = xtest, y_test = ytest,
type = "RRBoost", error = "rmse", y_init = "LADTree",
max_depth = 1, niter = 10, ## to keep the running time low
control = Boost.control(max_depth_init = 2,
min_leaf_size_init = 20, make_prediction = TRUE,
cal_imp = FALSE))
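## A hedged extension of the example above (not part of the original example):
## the same splits can be reused with the alternative "median" initial estimator.
model_RRBoost_median <- Boost(x_train = xtrain, y_train = ytrain,
  x_val = xval, y_val = yval, x_test = xtest, y_test = ytest,
  type = "RRBoost", error = "rmse", y_init = "median",
  max_depth = 1, niter = 10, ## again kept small to limit running time
  control = Boost.control(make_prediction = TRUE, cal_imp = FALSE))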