rbst {bst}  R Documentation 
Gradient boosting based on the MM (majorization/minimization) algorithm for optimizing nonconvex robust loss functions, with componentwise linear models, smoothing splines, or tree models as base learners.
rbst(x, y, cost = 0.5, rfamily = c("tgaussian", "thuber","thinge", "tbinom", "binomd", "texpo", "tpoisson", "clossR", "closs", "gloss", "qloss"), ctrl=bst_control(), control.tree=list(maxdepth = 1), learner=c("ls","sm","tree"),del=1e10)
x 
a data frame containing the variables in the model. 
y 
vector of responses. 
cost 
price to pay for false positive, 0 < cost < 1.
rfamily 
robust loss function, see details. 
ctrl 
an object of class bst_control.
control.tree 
control parameters of rpart. 
learner 
a character specifying the componentwise base learner to be used: "ls" for linear models, "sm" for smoothing splines, "tree" for tree models.

del 
convergence criterion
An MM algorithm operates by creating a convex surrogate function that majorizes the nonconvex objective function. When the surrogate function is minimized with a gradient boosting algorithm, the desired objective function is decreased. The MM algorithm includes the difference of convex (DC) algorithm for rfamily=c("tgaussian", "thuber","thinge", "tbinom", "binomd", "texpo", "tpoisson") and the quadratic majorization boosting algorithm (QMBA) for rfamily=c("clossR", "closs", "gloss", "qloss").
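The DC step can be illustrated on a toy problem. The sketch below is not the code used inside rbst: it estimates a single location parameter rather than a boosted model, it assumes a truncated squared-error loss of the form min((y - f)^2/2, s), and the names trunc_sq_loss and dc_location are hypothetical. Each iteration linearizes the subtracted convex part at the current estimate, minimizes the resulting convex surrogate in closed form, and records the nonconvex objective, which should not increase.

```r
# Toy DC (difference of convex) iteration for a truncated squared-error loss.
# Illustrative only: this is NOT the implementation used by rbst().
# Assumed loss: min((y - f)^2 / 2, s) = (y - f)^2 / 2 - max((y - f)^2 / 2 - s, 0)
trunc_sq_loss <- function(y, f, s) pmin((y - f)^2 / 2, s)

dc_location <- function(y, s, f0 = median(y), maxit = 100, tol = 1e-10) {
  f <- f0
  obj <- sum(trunc_sq_loss(y, f, s))
  for (k in seq_len(maxit)) {
    # Subgradient in f of the subtracted convex part max((y - f)^2 / 2 - s, 0)
    g <- ifelse((y - f)^2 / 2 > s, -(y - f), 0)
    # Closed-form minimizer of the convex surrogate
    #   sum((y - f)^2 / 2) - sum(g) * f
    f_new <- mean(y + g)
    obj <- c(obj, sum(trunc_sq_loss(y, f_new, s)))
    converged <- abs(f_new - f) < tol
    f <- f_new
    if (converged) break
  }
  list(f = f, objective = obj)
}

y <- c(0, 0.1, -0.1, 0.2, -0.2, 50)  # five inliers and one gross outlier
fit <- dc_location(y, s = 2)
fit$f      # settles near the mean of the five inliers
mean(y)    # the ordinary mean is pulled toward the outlier
```

In this toy setting the surrogate has a closed-form minimizer; in rbst the surrogate is instead minimized by gradient boosting, as described above.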
rfamily = "tgaussian" for truncated square error loss, "thuber" for truncated Huber loss, "thinge" for truncated hinge loss, "tbinom" for truncated logistic loss, "binomd" for logistic difference loss, "texpo" for truncated exponential loss, "tpoisson" for truncated Poisson loss, "clossR" for Closs in regression, "closs" for Closs in classification, "gloss" for Gloss, "qloss" for Qloss.
s must be a numeric value to be specified in bst_control. For rfamily="thinge", "tbinom", "texpo", s < 0. For rfamily="binomd", "tpoisson", "closs", "qloss", "clossR", s > 0, and for rfamily="gloss", s > 1. Some suggested s values: "thinge"= -1, "tbinom"= -log(3), "binomd"= log(4), "texpo"= log(0.5), "closs"=1, "gloss"=1.5, "qloss"=2, "clossR"=1.
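To see how s controls truncation, the following sketch evaluates a truncated hinge loss in two equivalent ways. It assumes the standard form min(max(1 - u, 0), 1 - s) with margin u = y * f; the exact parameterization used inside the package is not stated here, and the function names are hypothetical. The second form writes the same loss as a difference of two convex functions, which is the structure the DC algorithm exploits.

```r
# Assumed truncated hinge loss on the margin u = y * f (not taken from bst).
hinge       <- function(u) pmax(1 - u, 0)
trunc_hinge <- function(u, s) pmin(hinge(u), 1 - s)

# The same loss as a difference of two convex functions of u.
dc_form     <- function(u, s) pmax(1 - u, 0) - pmax(s - u, 0)

u <- seq(-4, 3, by = 0.5)
s <- -1                       # satisfies the s < 0 requirement for "thinge"
same <- all.equal(trunc_hinge(u, s), dc_form(u, s))
same                          # the two forms agree on this grid
max(trunc_hinge(u, s))        # the loss is capped at 1 - s
```

Under this assumed form, moving s closer to 0 lowers the cap 1 - s, so badly misclassified observations contribute less to the total loss.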
An object of class bst. For linear models, print, coef, plot and predict methods are available. For nonlinear models, print and predict methods are available.
x, y, cost, rfamily, learner, control.tree, maxdepth 
These are input variables and parameters 
ctrl 
the input 
yhat 
predicted function estimates 
ens 
a list of length 
ml.fit 
the last element of 
ensemble 
a vector of length 
xselect 
selected variables in 
coef 
estimated coefficients in 
Zhu Wang
Zhu Wang (2018), Quadratic Majorization for Nonconvex Loss with Applications to the Boosting Algorithm, Journal of Computational and Graphical Statistics, 27(3), 491-502, https://doi.org/10.1080/10618600.2018.1424635
Zhu Wang (2018), Robust boosting with truncated loss functions, Electronic Journal of Statistics, 12(1), 599-650, https://doi.org/10.1214/18-EJS1404
cv.rbst for cross-validated stopping iteration. Furthermore see bst_control.
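library(bst)
# Simulate a binary response that depends on the first predictor only
x <- matrix(rnorm(100*5), ncol=5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100, 1, p)
y[y != 1] <- -1          # recode the response as -1/1
y[1:10] <- -y[1:10]      # flip the first 10 labels to contaminate the data
x <- as.data.frame(x)
# Boosting with hinge loss and linear base learners
dat.m <- bst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)
# Twin boosting initialized from the first fit
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost=TRUE,
  coefir=coef(dat.m), xselect.init = dat.m$xselect, mstop=50))
# Robust boosting with truncated hinge loss
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop=50, s=0, trace=TRUE),
  rfamily = "thinge", learner = "ls")
predict(dat.m2)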