bst {bst} | R Documentation |
Boosting for Classification and Regression
Description
Gradient boosting for optimizing loss functions with componentwise linear models, smoothing splines, or tree models as base learners.
Usage
bst(x, y, cost = 0.5,
    family = c("gaussian", "hinge", "hinge2", "binom", "expo", "poisson",
               "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC",
               "tpoissonDC", "huber", "thuberDC", "clossR", "clossRMM", "closs",
               "gloss", "qloss", "clossMM", "glossMM", "qlossMM", "lar"),
    ctrl = bst_control(), control.tree = list(maxdepth = 1),
    learner = c("ls", "sm", "tree"))
## S3 method for class 'bst'
print(x, ...)
## S3 method for class 'bst'
predict(object, newdata = NULL, newy = NULL, mstop = NULL,
        type = c("response", "all.res", "class", "loss", "error"), ...)
## S3 method for class 'bst'
plot(x, type = c("step", "norm"), ...)
## S3 method for class 'bst'
coef(object, which = object$ctrl$mstop, ...)
## S3 method for class 'bst'
fpartial(object, mstop = NULL, newdata = NULL)
Arguments
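x | a data frame containing the variables in the model.
y | vector of responses.
cost | price to pay for false positive, 0 < cost < 1.
family | a variety of loss functions; the available values are listed in Usage.
ctrl | an object of class bst_control.
type | type of prediction or plot; the available values are listed in Usage for the predict and plot methods.
control.tree | control parameters of rpart.
learner | a character specifying the component-wise base learner to be used: "ls" for linear models, "sm" for smoothing splines, "tree" for tree models.
object | an object of class bst.
newdata | new data for prediction with the same number of columns as x.
newy | new response.
mstop | boosting iteration for prediction.
which | boosting iteration at which to extract coefficients; defaults to object$ctrl$mstop.
... | additional arguments.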
Details
Boosting algorithms for classification and regression problems. In a classification problem, suppose f is a classifier for a response y. A cost-sensitive or weighted loss function is
L(y, f, cost) = l(y, f, cost) \max(0, 1 - yf)
For family="hinge",
l(y, f, cost) = \begin{cases} 1 - cost, & \text{if } y = +1 \\ cost, & \text{if } y = -1 \end{cases}
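For family="hinge2",
l(y, f, cost) = \begin{cases} 1, & \text{if } y = +1 \text{ and } f > 0 \\ 1 - cost, & \text{if } y = +1 \text{ and } f < 0 \\ cost, & \text{if } y = -1 \text{ and } f > 0 \\ 1, & \text{if } y = -1 \text{ and } f < 0 \end{cases}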
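The weighted hinge loss above can be written out in a few lines of base R. This is only a sketch of the formulas as stated in this section, not the package's internal code; the function names hinge_weight and weighted_hinge are hypothetical, and the case f = 0 (which the hinge2 definition leaves open) is grouped with f < 0 here as an assumption.

```r
# Sketch of the cost-sensitive hinge loss from the Details section.
# hinge_weight() and weighted_hinge() are illustrative names, not bst functions.
# y: labels in {-1, +1}; f: real-valued classifier scores; cost in (0, 1).
hinge_weight <- function(y, f, cost, family = c("hinge", "hinge2")) {
  family <- match.arg(family)
  if (family == "hinge") {
    # weight depends on the true label only
    ifelse(y == 1, 1 - cost, cost)
  } else {
    # hinge2: weight 1 when sign(f) agrees with y, cost-based weight otherwise
    # (assumption: f == 0 is treated like f < 0)
    ifelse(y == 1,
           ifelse(f > 0, 1, 1 - cost),
           ifelse(f > 0, cost, 1))
  }
}

# L(y, f, cost) = l(y, f, cost) * max(0, 1 - y * f)
weighted_hinge <- function(y, f, cost = 0.5, family = "hinge") {
  hinge_weight(y, f, cost, family) * pmax(0, 1 - y * f)
}

y <- c(1, 1, -1, -1)
f <- c(2, -0.5, 0.5, -2)
loss_hinge  <- weighted_hinge(y, f, cost = 0.3, family = "hinge")
loss_hinge2 <- weighted_hinge(y, f, cost = 0.3, family = "hinge2")
print(loss_hinge)
print(loss_hinge2)
```

In this small example the two families give the same losses, because the observations on which their weights differ are classified with a margin of at least 1 and therefore have zero hinge loss.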
For twin boosting (twinboost=TRUE), there are two types of adaptive boosting if learner="ls": for twintype=1, weights are based on coefficients in the first round of boosting; for twintype=2, weights are based on predictions in the first round of boosting. See Buehlmann and Hothorn (2010).
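To make the idea of a componentwise linear base learner more concrete, the following base R sketch runs generic componentwise least-squares boosting under squared-error loss. It is an illustration of the general technique only and makes no claim about how bst implements learner="ls"; the function cw_boost, the step size nu, and the returned list elements are hypothetical names chosen for this example, and the predictors are assumed to be non-constant.

```r
# Generic componentwise least-squares boosting (squared-error loss).
# cw_boost() is an illustrative name, not part of the bst package.
cw_boost <- function(x, y, mstop = 50, nu = 0.1) {
  x <- scale(as.matrix(x), center = TRUE, scale = FALSE)  # center each predictor
  offset <- mean(y)
  f <- rep(offset, length(y))       # current function estimate
  beta <- numeric(ncol(x))          # coefficients on the centered predictors
  xselect <- integer(mstop)         # variable chosen at each iteration
  ss <- colSums(x^2)
  for (m in seq_len(mstop)) {
    u <- y - f                      # negative gradient of 0.5 * (y - f)^2
    b <- drop(crossprod(x, u)) / ss # least-squares slope for each single predictor
    rss <- sum(u^2) - b^2 * ss      # residual sum of squares for each predictor
    j <- which.min(rss)             # best-fitting single predictor
    beta[j] <- beta[j] + nu * b[j]  # shrunken coefficient update
    f <- f + nu * b[j] * x[, j]
    xselect[m] <- j
  }
  list(offset = offset, coef = beta, xselect = xselect, fitted = f)
}

set.seed(1)
x <- matrix(rnorm(100 * 5), ncol = 5)
y <- 2 * x[, 1] + rnorm(100, sd = 0.1)
fit <- cw_boost(x, y, mstop = 100, nu = 0.1)
print(fit$coef)
print(table(fit$xselect))
```

Because only one predictor is updated per iteration, variables that are never selected keep a coefficient of zero, which is what makes this kind of base learner useful for variable selection.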
Value
An object of class bst. For linear models, print, coef, plot and predict methods are available. For nonlinear models, print and predict methods are available.
x, y, cost, family, learner, control.tree, maxdepth | these are input variables and parameters.
ctrl | the input ctrl.
yhat | predicted function estimates.
ens | a list of length mstop.
ml.fit | the last element of ens.
ensemble | a vector of length mstop.
xselect | selected variables in mstop iterations.
coef | estimated coefficients in each iteration. Used internally only.
Author(s)
Zhu Wang
References
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Peter Buehlmann and Torsten Hothorn (2010), Twin Boosting: improved feature selection and prediction, Statistics and Computing, 20, 119-138.
See Also
cv.bst for cross-validated stopping iteration. Furthermore see bst_control.
Examples
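library(bst)

# Simulate five predictors and a +1/-1 response that depends on the first one
x <- matrix(rnorm(100*5), ncol=5)
eta <- 2*x[,1]
p <- exp(eta)/(exp(eta)+exp(-eta))
y <- rbinom(100, 1, p)
y[y != 1] <- -1
x <- as.data.frame(x)

# Boosting with hinge loss and componentwise linear base learners
dat.m <- bst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)

# Twin boosting, initialized with the coefficients and selected variables from dat.m
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost=TRUE,
    coefir=coef(dat.m), xselect.init = dat.m$xselect, mstop=50))

# Boosting with rbst() and rfamily = "thinge"
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop=50, s=0, trace=TRUE),
    rfamily = "thinge", learner = "ls")
predict(dat.m2)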