XGBModel {MachineShop}    R Documentation

Extreme Gradient Boosting Models


Description

Fits models with an efficient implementation of the gradient boosting framework from Chen & Guestrin.


Usage

XGBModel(
  nrounds = 100,
  ...,
  objective = character(),
  aft_loss_distribution = "normal",
  aft_loss_distribution_scale = 1,
  base_score = 0.5,
  verbose = 0,
  print_every_n = 1
)

XGBDARTModel(
  eta = 0.3,
  gamma = 0,
  max_depth = 6,
  min_child_weight = 1,
  max_delta_step = .(0.7 * is(y, "PoissonVariate")),
  subsample = 1,
  colsample_bytree = 1,
  colsample_bylevel = 1,
  colsample_bynode = 1,
  alpha = 0,
  lambda = 1,
  tree_method = "auto",
  sketch_eps = 0.03,
  scale_pos_weight = 1,
  refresh_leaf = 1,
  process_type = "default",
  grow_policy = "depthwise",
  max_leaves = 0,
  max_bin = 256,
  num_parallel_tree = 1,
  sample_type = "uniform",
  normalize_type = "tree",
  rate_drop = 0,
  one_drop = 0,
  skip_drop = 0,
  ...
)

XGBLinearModel(
  alpha = 0,
  lambda = 0,
  updater = "shotgun",
  feature_selector = "cyclic",
  top_k = 0,
  ...
)

XGBTreeModel(
  eta = 0.3,
  gamma = 0,
  max_depth = 6,
  min_child_weight = 1,
  max_delta_step = .(0.7 * is(y, "PoissonVariate")),
  subsample = 1,
  colsample_bytree = 1,
  colsample_bylevel = 1,
  colsample_bynode = 1,
  alpha = 0,
  lambda = 1,
  tree_method = "auto",
  sketch_eps = 0.03,
  scale_pos_weight = 1,
  refresh_leaf = 1,
  process_type = "default",
  grow_policy = "depthwise",
  max_leaves = 0,
  max_bin = 256,
  num_parallel_tree = 1,
  ...
)



Arguments

nrounds

number of boosting iterations.

...

model parameters as described below and in the XGBoost documentation, and arguments passed to XGBModel from the other constructors.

objective

optional character string defining the learning task and objective. Set automatically if not specified, according to the following values available for supported response variable types.

factor: "multi:softprob", "binary:logistic" (2 levels only)

numeric: "reg:squarederror", "reg:logistic", "reg:gamma", "reg:tweedie", "rank:pairwise", "rank:ndcg", "rank:map"

PoissonVariate: "count:poisson"

Surv: "survival:aft", "survival:cox"

The first values listed are the defaults for the corresponding response types.


aft_loss_distribution

character string specifying a distribution for the accelerated failure time objective ("survival:aft") as "extreme", "logistic", or "normal".

aft_loss_distribution_scale

numeric scaling parameter for the accelerated failure time distribution.

base_score

initial prediction score of all observations (global bias).

verbose

numeric value controlling the amount of output printed during model fitting, such that 0 = none, 1 = performance information, and 2 = additional information.

print_every_n

numeric value designating the fitting iterations at which to print output when verbose > 0.

eta

shrinkage of variable weights at each iteration to prevent overfitting.

gamma

minimum loss reduction required to split a tree node.

max_depth

maximum tree depth.

min_child_weight

minimum sum of observation weights required of nodes.

max_delta_step, tree_method, sketch_eps, scale_pos_weight, updater, refresh_leaf, process_type, grow_policy, max_leaves, max_bin, num_parallel_tree

other tree booster parameters.


subsample

subsample ratio of the training observations.

colsample_bytree, colsample_bylevel, colsample_bynode

subsample ratio of variables for each tree, level, or split.

alpha, lambda

L1 and L2 regularization terms for variable weights.

sample_type, normalize_type

type of sampling and normalization algorithms.


rate_drop

rate at which to drop trees during the dropout procedure.

one_drop

integer indicating whether to drop at least one tree during the dropout procedure.

skip_drop

probability of skipping the dropout procedure during a boosting iteration.

feature_selector, top_k

character string specifying the feature selection and ordering method, and number of top variables to select in the "greedy" and "thrifty" feature selectors.
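
As a minimal sketch of setting the objective arguments described above, the calls below request "binary:logistic" for a two-level factor response and a logistic accelerated failure time distribution for a Surv response. The two-level subset of iris and the use of the veteran data from the survival package are illustrative assumptions, not part of the package documentation.

```r
## Requires prior installation of suggested package xgboost to run
library(MachineShop)
library(survival)

## Two-level factor response (illustrative subset of iris)
iris2 <- droplevels(subset(iris, Species != "setosa"))

## objective is passed through ... to XGBModel
binary_model <- XGBTreeModel(objective = "binary:logistic")
binary_fit <- fit(Species ~ ., data = iris2, model = binary_model)

## Surv response with a logistic accelerated failure time distribution
aft_model <- XGBTreeModel(
  objective = "survival:aft",
  aft_loss_distribution = "logistic",
  aft_loss_distribution_scale = 1
)
aft_fit <- fit(Surv(time, status) ~ ., data = veteran, model = aft_model)
```

If objective is left unspecified, the default for the response type is used instead.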


Details

Response types:

factor, numeric, PoissonVariate, Surv

Automatic tuning of grid parameters:
  • XGBModel: NULL

  • XGBDARTModel: nrounds, eta*, gamma*, max_depth, min_child_weight*, subsample*, colsample_bytree*, rate_drop*, skip_drop*

  • XGBLinearModel: nrounds, alpha, lambda

  • XGBTreeModel: nrounds, eta*, gamma*, max_depth, min_child_weight*, subsample*, colsample_bytree*

* excluded from grids by default
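
A minimal sketch of tuning over the grid parameters listed above, assuming the TunedModel, expand_params, and CVControl functions from MachineShop. The grid values and number of folds are illustrative assumptions. Supplying an explicit grid is one way to include a parameter, such as eta, that is excluded from the automatic grid by default.

```r
## Requires prior installation of suggested package xgboost to run
library(MachineShop)

## Explicit tuning grid; values are illustrative only
tuning_grid <- expand_params(
  nrounds = c(50, 100),
  max_depth = c(2, 4),
  eta = c(0.1, 0.3)
)

## Select among the grid points with 5-fold cross-validation
tuned_model <- TunedModel(
  XGBTreeModel,
  grid = tuning_grid,
  control = CVControl(folds = 5)
)
model_fit <- fit(Species ~ ., data = iris, model = tuned_model)
```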

The booster-specific constructor functions XGBDARTModel, XGBLinearModel, and XGBTreeModel are special cases of XGBModel which automatically set the XGBoost booster parameter. These are called directly in typical usage unless XGBModel is needed to specify a more general model.
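
A minimal sketch of the booster-specific constructors in typical usage. The parameter values are illustrative assumptions; nrounds is passed through ... to XGBModel in each call.

```r
library(MachineShop)

## Each constructor sets the XGBoost booster parameter automatically
tree_model <- XGBTreeModel(nrounds = 50, eta = 0.1, max_depth = 3)
dart_model <- XGBDARTModel(nrounds = 50, rate_drop = 0.1, skip_drop = 0.5)
linear_model <- XGBLinearModel(nrounds = 50, alpha = 0.1, lambda = 1)
```

Any of these objects can be supplied as the model argument of fit or resample.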

Default argument values and further model details can be found in the source documentation linked under See Also below.

In calls to varimp for XGBTreeModel, argument type may be specified as "Gain" (default) for the fractional contribution of each predictor to the total gain of its splits, as "Cover" for the number of observations related to each predictor, or as "Frequency" for the percentage of times each predictor is used in the trees. Variable importance is automatically scaled to range from 0 to 100. To obtain unscaled importance values, set scale = FALSE. See example below.


Value

MLModel class object.

See Also

xgboost, fit, resample


Examples

## Requires prior installation of suggested package xgboost to run

model_fit <- fit(Species ~ ., data = iris, model = XGBTreeModel)
varimp(model_fit, method = "model", type = "Frequency", scale = FALSE)

[Package MachineShop version 3.7.0 Index]