mlr_learners_regr.xgboost {mlr3learners}    R Documentation

Extreme Gradient Boosting Regression Learner


eXtreme Gradient Boosting regression. Calls xgboost::xgb.train() from package xgboost.

To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See

Note that using the watchlist parameter directly will lead to problems when wrapping this Learner in a mlr3pipelines GraphLearner as the preprocessing steps will not be applied to the data in the watchlist.


This Learner can be instantiated via the dictionary mlr_learners or with the associated sugar function lrn():
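A minimal sketch of both forms, assuming the mlr3 and mlr3learners packages are installed (the variable names are illustrative only):

```r
library(mlr3)
library(mlr3learners)

# Retrieve the learner from the dictionary
learner_a = mlr_learners$get("regr.xgboost")

# Equivalent shortcut via the sugar function
learner_b = lrn("regr.xgboost")

# Both calls construct a learner with the same id
print(learner_b$id)
```

Constructing the learner does not fit a model; the xgboost package is only needed once the learner is trained.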


Meta Information


Id                           Type       Default           Levels                                    Range
alpha                        numeric    0                                                           [0, ∞)
approxcontrib                logical    FALSE             TRUE, FALSE                               -
base_score                   numeric    0.5                                                         (-∞, ∞)
booster                      character  gbtree            gbtree, gblinear, dart                    -
callbacks                    untyped    list                                                        -
colsample_bylevel            numeric    1                                                           [0, 1]
colsample_bynode             numeric    1                                                           [0, 1]
colsample_bytree             numeric    1                                                           [0, 1]
device                       untyped    cpu                                                         -
disable_default_eval_metric  logical    FALSE             TRUE, FALSE                               -
early_stopping_rounds        integer    NULL                                                        [1, ∞)
early_stopping_set           character  none              none, train, test                         -
eta                          numeric    0.3                                                         [0, 1]
eval_metric                  untyped    rmse                                                        -
feature_selector             character  cyclic            cyclic, shuffle, random, greedy, thrifty  -
feval                        untyped                                                                -
gamma                        numeric    0                                                           [0, ∞)
grow_policy                  character  depthwise         depthwise, lossguide                      -
interaction_constraints      untyped    -                                                           -
iterationrange               untyped    -                                                           -
lambda                       numeric    1                                                           [0, ∞)
lambda_bias                  numeric    0                                                           [0, ∞)
max_bin                      integer    256                                                         [2, ∞)
max_delta_step               numeric    0                                                           [0, ∞)
max_depth                    integer    6                                                           [0, ∞)
max_leaves                   integer    0                                                           [0, ∞)
maximize                     logical    NULL              TRUE, FALSE                               -
min_child_weight             numeric    1                                                           [0, ∞)
missing                      numeric    NA                                                          (-∞, ∞)
monotone_constraints         untyped    0                                                           -
normalize_type               character  tree              tree, forest                              -
nrounds                      integer    -                                                           [1, ∞)
nthread                      integer    1                                                           [1, ∞)
ntreelimit                   integer    NULL                                                        [1, ∞)
num_parallel_tree            integer    1                                                           [1, ∞)
objective                    untyped    reg:squarederror                                            -
one_drop                     logical    FALSE             TRUE, FALSE                               -
outputmargin                 logical    FALSE             TRUE, FALSE                               -
predcontrib                  logical    FALSE             TRUE, FALSE                               -
predinteraction              logical    FALSE             TRUE, FALSE                               -
predleaf                     logical    FALSE             TRUE, FALSE                               -
print_every_n                integer    1                                                           [1, ∞)
process_type                 character  default           default, update                           -
rate_drop                    numeric    0                                                           [0, 1]
refresh_leaf                 logical    TRUE              TRUE, FALSE                               -
reshape                      logical    FALSE             TRUE, FALSE                               -
sampling_method              character  uniform           uniform, gradient_based                   -
sample_type                  character  uniform           uniform, weighted                         -
save_name                    untyped                                                                -
save_period                  integer    NULL                                                        [0, ∞)
scale_pos_weight             numeric    1                                                           (-∞, ∞)
seed_per_iteration           logical    FALSE             TRUE, FALSE                               -
skip_drop                    numeric    0                                                           [0, 1]
strict_shape                 logical    FALSE             TRUE, FALSE                               -
subsample                    numeric    1                                                           [0, 1]
top_k                        integer    0                                                           [0, ∞)
training                     logical    FALSE             TRUE, FALSE                               -
tree_method                  character  auto              auto, exact, approx, hist, gpu_hist       -
tweedie_variance_power       numeric    1.5                                                         [1, 2]
updater                      untyped    -                                                           -
verbose                      integer    1                                                           [0, 2]
watchlist                    untyped                                                                -
xgb_model                    untyped                                                                -
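
The parameters listed above can be passed to lrn() at construction or changed later through the learner's param_set. The following is a minimal sketch, assuming mlr3 and mlr3learners are installed; the chosen values are arbitrary illustrations, not recommendations:

```r
library(mlr3)
library(mlr3learners)

# Set hyperparameters at construction (illustrative values only)
learner = lrn("regr.xgboost", nrounds = 50L, eta = 0.1, max_depth = 4L)

# Change or add a hyperparameter after construction
learner$param_set$values$subsample = 0.8

# Inspect the values that are currently set
print(learner$param_set$values[c("nrounds", "eta", "max_depth", "subsample")])
```

Values outside the ranges given in the table are rejected when they are assigned, so typos in a range-bounded parameter surface before training starts.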

Early stopping

Early stopping can be used to find the optimal number of boosting rounds. The early_stopping_set parameter controls which set is used to monitor performance. Set early_stopping_set = "test" to monitor the performance of the model on the test set while training. The test set for early stopping can be set with the "test" row role in the mlr3::Task. Additionally, set early_stopping_rounds to the number of rounds within which the performance must improve, and nrounds to the maximum number of boosting rounds. During resampling, the test set is taken automatically from the mlr3::Resampling. Note that using the test set for early stopping can potentially bias the performance scores. See the section on early stopping in the examples.

Initial parameter values

Super classes

mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrXgboost


Public methods

Inherited methods

Method new()

Creates a new instance of this R6 class.


Method importance()

The importance scores are calculated with xgboost::xgb.importance().


Named numeric().

Method clone()

The objects of this class are cloneable with this method.

LearnerRegrXgboost$clone(deep = FALSE)

deep: Whether to make a deep clone.
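
As a minimal sketch of why a deep clone can be useful, assuming mlr3 and mlr3learners are installed (the eta values are arbitrary): the copy gets its own parameter set, so changing it leaves the original learner untouched.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("regr.xgboost", eta = 0.3)

# A deep clone also copies nested R6 objects such as the parameter set
copy = learner$clone(deep = TRUE)
copy$param_set$values$eta = 0.05

# The original learner keeps its own value
print(learner$param_set$values$eta)
print(copy$param_set$values$eta)
```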




Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi:10.1145/2939672.2939785.

See Also

Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm


## Not run: 
if (requireNamespace("xgboost", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.xgboost")

  # Define a Task
  task = tsk("mtcars")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Importance method (call it; without parentheses the function itself is printed)
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
## End(Not run)

## Not run: 
# Train learner with early stopping on the mtcars data set
task = tsk("mtcars")

# Split task into training and test set
split = partition(task, ratio = 0.8)
task$set_row_roles(split$test, "test")

# Set early stopping parameters
learner = lrn("regr.xgboost",
  nrounds = 100,
  early_stopping_rounds = 10,
  early_stopping_set = "test"
)

# Train learner with early stopping
learner$train(task)
## End(Not run)

[Package mlr3learners version 0.6.0 Index]