xgb_train_offset {offsetreg}    R Documentation
Boosted Poisson Trees with Offsets via xgboost
Description

xgb_train_offset() and xgb_predict_offset() are wrappers for xgboost tree-based models where all of the model arguments are in the main function. These functions are nearly identical to the parsnip functions parsnip::xgb_train() and parsnip::xgb_predict(), except that the objective "count:poisson" is passed to xgboost::xgb.train() and an offset term is added to the data set.
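Concretely, the offset enters on the log scale as a fixed margin added to each prediction before the exponential link. Below is a minimal sketch of that mechanism using xgboost directly, with made-up column names; it illustrates roughly what these wrappers manage for you and is not the package's exact internals.

library(xgboost)

# Toy Poisson data with an exposure column (hypothetical names)
set.seed(1)
dat <- data.frame(x1 = rnorm(200), exposure = runif(200, 1, 10))
dat$y <- rpois(200, lambda = exp(0.3 * dat$x1) * dat$exposure)

# Build the DMatrix and attach log(exposure) as a base margin, which acts
# as a fixed offset on the log scale under the Poisson objective
dtrain <- xgboost::xgb.DMatrix(as.matrix(dat["x1"]), label = dat$y)
xgboost::setinfo(dtrain, "base_margin", log(dat$exposure))

fit <- xgboost::xgb.train(
  params = list(objective = "count:poisson", eta = 0.3, max_depth = 2),
  data = dtrain,
  nrounds = 15
)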
Usage
xgb_train_offset(
  x,
  y,
  offset_col = "offset",
  weights = NULL,
  max_depth = 6,
  nrounds = 15,
  eta = 0.3,
  colsample_bynode = NULL,
  colsample_bytree = NULL,
  min_child_weight = 1,
  gamma = 0,
  subsample = 1,
  validation = 0,
  early_stop = NULL,
  counts = TRUE,
  ...
)
xgb_predict_offset(object, new_data, offset_col = "offset", ...)
Arguments

x
A data frame or matrix of predictors.

y
A vector (numeric) or matrix (numeric) of outcome data.

offset_col
Character string. The name of the column in the data containing offsets.

weights
A numeric vector of sample weights.

max_depth
An integer for the maximum depth of the tree.

nrounds
An integer for the number of boosting iterations.

eta
A numeric value between zero and one to control the learning rate.

colsample_bynode
Subsampling proportion of columns for each node within each tree. See the counts argument below. The default uses all columns.

colsample_bytree
Subsampling proportion of columns for each tree. See the counts argument below. The default uses all columns.

min_child_weight
A numeric value for the minimum sum of instance weights needed in a child to continue splitting.

gamma
A number for the minimum loss reduction required to make a further partition on a leaf node of the tree.

subsample
Subsampling proportion of rows. By default, all of the training data are used.

validation
The proportion of the data that are used for performance assessment and potential early stopping (illustrated in the sketch following this list).

early_stop
An integer or NULL. If not NULL, it is the number of training iterations without improvement before stopping.

counts
A logical. If FALSE, colsample_bynode and colsample_bytree are interpreted as proportions of the columns (values between zero and one); if TRUE, they are interpreted as counts of columns.

...
Other options to pass to xgboost::xgb.train().

object
A fitted xgboost object returned by xgb_train_offset().

new_data
New data for predictions. Can be a data frame or matrix and must contain the column named by offset_col.
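As a hedged illustration of the validation and early_stop arguments (the values below are arbitrary, and the data setup mirrors the Examples section):

us_deaths$off <- log(us_deaths$population)
x <- model.matrix(~ age_group + gender + off, us_deaths)[, -1]

# Hold out 20% of the rows for assessment and stop boosting after
# 10 consecutive rounds without improvement
mod_es <- xgb_train_offset(
  x, us_deaths$deaths,
  offset_col = "off",
  nrounds = 200,
  validation = 0.2,
  early_stop = 10,
  counts = FALSE
)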
Value

A fitted xgboost object.
Examples

# Add a log-exposure offset column and build the predictor matrix
us_deaths$off <- log(us_deaths$population)
x <- model.matrix(~ age_group + gender + off, us_deaths)[, -1]

# Fit a boosted Poisson model that treats "off" as the offset column
mod <- xgb_train_offset(x, us_deaths$deaths, offset_col = "off",
                        eta = 1, colsample_bynode = 1,
                        max_depth = 2, nrounds = 25,
                        counts = FALSE)

# Predict on the training data, again supplying the offset column
xgb_predict_offset(mod, x, offset_col = "off")
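As a follow-up sketch (assuming, as the Poisson objective and offset suggest, that the returned values are expected counts on the response scale), the predictions can be compared with the observed deaths:

preds <- xgb_predict_offset(mod, x, offset_col = "off")
head(data.frame(observed = us_deaths$deaths, predicted = round(preds, 1)))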