stackgbm {stackgbm}    R Documentation

Model stacking for boosted trees

Description

Model stacking with a two-layer architecture: the first layer consists of boosted tree models fitted by xgboost, lightgbm, and catboost; the second layer is a logistic regression model fitted on their out-of-fold predictions.
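The two-layer scheme can be illustrated with a minimal base-R sketch. Here glm() learners on feature subsets are hypothetical stand-ins for the three boosted tree models; this is an illustration of the stacking idea, not the package's internals:

```r
set.seed(42)

# Simulated binary classification data
n <- 500
x <- matrix(rnorm(n * 4), n, 4)
y <- rbinom(n, 1, plogis(x %*% c(1, -1, 0.5, 0)))

# First layer: out-of-fold predictions from each base learner.
# glm() models on feature subsets stand in for xgboost/lightgbm/catboost.
n_folds <- 5
folds <- sample(rep(seq_len(n_folds), length.out = n))
d <- data.frame(y = y, x)
oof <- matrix(NA_real_, n, 2)
for (k in seq_len(n_folds)) {
  tr <- folds != k
  m1 <- glm(y ~ X1 + X2, binomial, d[tr, ])
  m2 <- glm(y ~ X3 + X4, binomial, d[tr, ])
  oof[!tr, 1] <- predict(m1, d[!tr, ], type = "response")
  oof[!tr, 2] <- predict(m2, d[!tr, ], type = "response")
}

# Second layer: logistic regression stacked on the out-of-fold predictions
stacker <- glm(y ~ ., binomial, data.frame(y = y, oof))
head(fitted(stacker))
```

Because each first-layer prediction is made on data held out from that learner's fit, the second-layer model is trained on honest probabilities rather than in-sample fits.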

Usage

stackgbm(x, y, params, n_folds = 5L, seed = 42, verbose = TRUE)

Arguments

x

Predictor matrix.

y

Response vector.

params

A list of optimal parameter objects for boosted tree models derived from cv_xgboost(), cv_lightgbm(), and cv_catboost(). The order does not matter.

n_folds

Number of cross-validation folds used to generate the first-layer out-of-fold predictions. Default is 5.

seed

Random seed for reproducibility. Default is 42.

verbose

Show progress? Default is TRUE.

Value

The fitted first-layer boosted tree models and the second-layer logistic regression (stacking) model.

Examples


sim_data <- msaenet::msaenet.sim.binomial(
  n = 1000,
  p = 50,
  rho = 0.6,
  coef = rnorm(25, mean = 0, sd = 10),
  snr = 1,
  p.train = 0.8,
  seed = 42
)

params_xgboost <- structure(
  list("nrounds" = 200, "eta" = 0.05, "max_depth" = 3),
  class = c("cv_params", "cv_xgboost")
)
params_lightgbm <- structure(
  list("num_iterations" = 200, "max_depth" = 3, "learning_rate" = 0.05),
  class = c("cv_params", "cv_lightgbm")
)
params_catboost <- structure(
  list("iterations" = 100, "depth" = 3),
  class = c("cv_params", "cv_catboost")
)

fit <- stackgbm(
  sim_data$x.tr,
  sim_data$y.tr,
  params = list(
    params_xgboost,
    params_lightgbm,
    params_catboost
  )
)

predict(fit, newx = sim_data$x.te)


[Package stackgbm version 0.1.0 Index]