blackboost {mboost}    R Documentation
Gradient Boosting with Regression Trees
Description
Gradient boosting for optimizing arbitrary loss functions where regression trees are utilized as base-learners.
Usage
blackboost(formula, data = list(),
           weights = NULL, na.action = na.pass,
           offset = NULL, family = Gaussian(),
           control = boost_control(),
           oobweights = NULL,
           tree_controls = partykit::ctree_control(
               teststat = "quad",
               testtype = "Teststatistic",
               mincriterion = 0,
               minsplit = 10,
               minbucket = 4,
               maxdepth = 2,
               saveinfo = FALSE),
           ...)
Arguments
formula
a symbolic description of the model to be fit.

data
a data frame containing the variables in the model.

weights
an optional vector of weights to be used in the fitting process.

na.action
a function which indicates what should happen when the data contain NAs.

offset
a numeric vector to be used as offset (optional).

family
a Family object.

control
a list of parameters controlling the algorithm. For more details see boost_control.

oobweights
an additional vector of out-of-bag weights, which is used for the out-of-bag risk (i.e., if boost_control(risk = "oobag")).

tree_controls
an object of class "TreeControl", which can be obtained using ctree_control. Defines hyper-parameters for the trees used as base-learners. By default, trees of maximal depth 2 are used, allowing two-way interactions but no deeper trees.

...
additional arguments passed to mboost_fit.
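As a sketch of how weights, the out-of-bag risk option and tree_controls fit together, consider the following; the hold-out pattern, mstop and maxdepth values are illustrative choices, not package defaults:

### grow slightly deeper trees than the default depth-2 base-learners
library("mboost")
ctrl <- partykit::ctree_control(teststat = "quad", testtype = "Teststatistic",
                                mincriterion = 0, maxdepth = 4, saveinfo = FALSE)
### declare every fifth observation out-of-bag via a zero weight
w <- rep(1, nrow(cars))
w[seq(5, nrow(cars), by = 5)] <- 0
mod <- blackboost(dist ~ speed, data = cars, weights = w,
                  control = boost_control(mstop = 100, risk = "oobag"),
                  tree_controls = ctrl)
risk(mod)   ### risk path evaluated on the zero-weight observations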
Details
This function implements the 'classical' gradient boosting algorithm utilizing regression trees as base-learners. Essentially, the same algorithm is implemented in package gbm. The main difference is that arbitrary loss functions to be optimized can be specified via the family argument to blackboost, whereas gbm uses hard-coded loss functions. Moreover, the base-learners (conditional inference trees, see ctree) are a little bit more flexible.

The regression fit is a black-box prediction machine and thus hardly interpretable.

Partial dependency plots are not yet available; see the example section for plotting of additive tree models.
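As a sketch of the point about interchangeable loss functions, the following fits the same model under the default squared-error loss and under the absolute-error loss; Laplace() is one of the loss families shipped with mboost, and the cars data are just an example choice:

library("mboost")
### same model, two different losses
m_l2 <- blackboost(dist ~ speed, data = cars)                      ### Gaussian(): squared error
m_l1 <- blackboost(dist ~ speed, data = cars, family = Laplace())  ### absolute error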
Value
An object of class mboost with print and predict methods being available.
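The fitted object therefore works with the usual mboost methods; a minimal prediction sketch (the new speed values are made up for illustration):

library("mboost")
m <- blackboost(dist ~ speed, data = cars)
predict(m, newdata = data.frame(speed = c(10, 20)))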
References
Peter Buehlmann and Torsten Hothorn (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.
Torsten Hothorn, Kurt Hornik and Achim Zeileis (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674.
Yoav Freund and Robert E. Schapire (1996), Experiments with a new boosting algorithm. In Machine Learning: Proc. Thirteenth International Conference, 148–156.
Jerome H. Friedman (2001), Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29, 1189–1232.
Greg Ridgeway (1999), The state of boosting. Computing Science and Statistics, 31, 172–181.
See Also
See mboost_fit for the generic boosting function, glmboost for boosted linear models, and gamboost for boosted additive models.

See baselearners for possible base-learners.

See cvrisk for cross-validated stopping iteration. Furthermore see boost_control, Family and methods.
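Since the stopping iteration is the main tuning parameter, a brief sketch of using cvrisk for this purpose (cvrisk's default bootstrap folds are assumed; the cars data are an example choice):

library("mboost")
m <- blackboost(dist ~ speed, data = cars)
cvm <- cvrisk(m)     ### cross-validated empirical risk
mstop(cvm)           ### optimal number of boosting iterations
m[mstop(cvm)]        ### set the model to that iteration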
Examples
### a simple two-dimensional example: cars data
cars.gb <- blackboost(dist ~ speed, data = cars,
                      control = boost_control(mstop = 50))
cars.gb
### plot fit
plot(dist ~ speed, data = cars)
lines(cars$speed, predict(cars.gb), col = "red")
### set up and plot additive tree model
if (require("partykit")) {
ctrl <- ctree_control(maxdepth = 3)
viris <- subset(iris, Species != "setosa")
viris$Species <- viris$Species[, drop = TRUE]
imod <- mboost(Species ~ btree(Sepal.Length, tree_controls = ctrl) +
btree(Sepal.Width, tree_controls = ctrl) +
btree(Petal.Length, tree_controls = ctrl) +
btree(Petal.Width, tree_controls = ctrl),
data = viris, family = Binomial())[500]
layout(matrix(1:4, ncol = 2))
plot(imod)
}