| bst {bst} | R Documentation |
Boosting for Classification and Regression
Description
Gradient boosting for optimizing loss functions with componentwise linear, smoothing splines, tree models as base learners.
Usage
bst(x, y, cost = 0.5, family = c("gaussian", "hinge", "hinge2", "binom", "expo",
"poisson", "tgaussianDC", "thingeDC", "tbinomDC", "binomdDC", "texpoDC", "tpoissonDC",
"huber", "thuberDC", "clossR", "clossRMM", "closs", "gloss", "qloss", "clossMM",
"glossMM", "qlossMM", "lar"), ctrl = bst_control(), control.tree = list(maxdepth = 1),
learner = c("ls", "sm", "tree"))
## S3 method for class 'bst'
print(x, ...)
## S3 method for class 'bst'
predict(object, newdata=NULL, newy=NULL, mstop=NULL,
type=c("response", "all.res", "class", "loss", "error"), ...)
## S3 method for class 'bst'
plot(x, type = c("step", "norm"),...)
## S3 method for class 'bst'
coef(object, which=object$ctrl$mstop, ...)
## S3 method for class 'bst'
fpartial(object, mstop=NULL, newdata=NULL)
Arguments
x |
a data frame containing the variables in the model. |
y |
vector of responses. |
cost |
price to pay for false positive, 0 < |
family |
A variety of loss functions.
|
ctrl |
an object of class |
type |
|
control.tree |
control parameters of rpart. |
learner |
a character specifying the component-wise base learner to be used:
|
object |
class of |
newdata |
new data for prediction with the same number of columns as |
newy |
new response. |
mstop |
boosting iteration for prediction. |
which |
at which boosting |
... |
additional arguments. |
Details
Boosting algorithms for classification and regression problems. In a classification problem, suppose f is a classifier for a response y. A cost-sensitive or weighted loss function is
L(y,f,cost)=l(y,f,cost)\max(0, (1-yf))
For family="hinge",
l(y,f,cost)=
1-cost, if \, y= +1;
\quad cost, if \, y= -1
For family="hinge2",
l(y,f,cost)= 1, if y = +1 and f > 0 ; = 1-cost, if y = +1 and f < 0; = cost, if y = -1 and f > 0; = 1, if y = -1 and f < 0.
For twin boosting if twinboost=TRUE, there are two types of adaptive boosting if learner="ls": for twintype=1, weights are based on coefficients in the first round of boosting; for twintype=2, weights are based on predictions in the first round of boosting. See Buehlmann and Hothorn (2010).
Value
An object of class bst with print, coef,
plot and predict methods are available for linear models.
For nonlinear models, methods print and predict are available.
x, y, cost, family, learner, control.tree, maxdepth |
These are input variables and parameters |
ctrl |
the input |
yhat |
predicted function estimates |
ens |
a list of length |
ml.fit |
the last element of |
ensemble |
a vector of length |
xselect |
selected variables in |
coef |
estimated coefficients in each iteration. Used internally only |
Author(s)
Zhu Wang
References
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Peter Buehlmann and Torsten Hothorn (2010), Twin Boosting: improved feature selection and prediction, Statistics and Computing, 20, 119-138.
See Also
cv.bst for cross-validated stopping iteration. Furthermore see
bst_control
Examples
x <- matrix(rnorm(100*5),ncol=5)
c <- 2*x[,1]
p <- exp(c)/(exp(c)+exp(-c))
y <- rbinom(100,1,p)
y[y != 1] <- -1
x <- as.data.frame(x)
dat.m <- bst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)
dat.m1 <- bst(x, y, ctrl = bst_control(twinboost=TRUE,
coefir=coef(dat.m), xselect.init = dat.m$xselect, mstop=50))
dat.m2 <- rbst(x, y, ctrl = bst_control(mstop=50, s=0, trace=TRUE),
rfamily = "thinge", learner = "ls")
predict(dat.m2)