mbst {bst}	R Documentation
Boosting for Multi-Classification
Description
Gradient boosting for optimizing multi-class loss functions, with componentwise linear models, smoothing splines, or trees as base learners.
Usage
mbst(x, y, cost = NULL, family = c("hinge", "hinge2", "thingeDC", "closs", "clossMM"),
ctrl = bst_control(), control.tree=list(fixed.depth=TRUE,
n.term.node=6, maxdepth = 1), learner = c("ls", "sm", "tree"))
## S3 method for class 'mbst'
print(x, ...)
## S3 method for class 'mbst'
predict(object, newdata=NULL, newy=NULL, mstop=NULL,
type=c("response", "class", "loss", "error"), ...)
## S3 method for class 'mbst'
fpartial(object, mstop=NULL, newdata=NULL)
Arguments
x |
a data frame containing the variables in the model. |
y |
vector of responses; y takes values in 1, 2, ..., k for a k-class problem. |
cost |
price to pay for a false positive, 0 < cost < 1; the price of a false negative is 1 - cost. |
family |
family="hinge" for the multi-class hinge loss as in mhingebst, family="hinge2" for the hinge loss without response recoding (see Details), family="thingeDC" for the robust truncated hinge loss used in the DCB algorithm (see rmbst), and family="closs" and family="clossMM" for robust nonconvex loss functions. |
ctrl |
an object of class bst_control. |
control.tree |
control parameters of rpart. |
learner |
a character specifying the component-wise base learner to be used: "ls" linear models, "sm" smoothing splines, "tree" regression trees. |
type |
in predict, a character indicating what to return: function estimates ("response"), class labels ("class"), loss values ("loss") or misclassification errors ("error"); see the sketch following these arguments. |
object |
an object of class mbst. |
newdata |
new data for prediction, with the same number of columns as x. |
newy |
new response. |
mstop |
boosting iteration for prediction. |
... |
additional arguments. |
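As a brief illustration of the type argument, a minimal sketch (fit, x and y are placeholders for a fitted mbst object and its training data, as in the Examples below):

## function estimates, class labels and misclassification error from one fit
fhat <- predict(fit, newdata = x, type = "response")
cls  <- predict(fit, newdata = x, type = "class")
err  <- predict(fit, newdata = x, newy = y, type = "error")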
Details
A linear or nonlinear classifier is fitted using a boosting algorithm for multi-class responses. This function differs from mhingebst in how the sum-to-zero constraint and the loss functions are handled. If family="hinge", the loss function is the same as in mhingebst, but the boosting algorithm is different. If family="hinge2", the loss function differs from family="hinge": the response is not recoded as in Wang (2012). In this case, the loss function is
\sum_j I(y_i \neq j)(f_j + 1)_+.
family="thingeDC"
for robust loss function used in the DCB algorithm.
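For concreteness, the family="hinge2" loss above can be evaluated directly in plain R; this is a minimal sketch of the formula, not part of the package API:

## hinge2 loss for one observation: sum over classes j != y_i of (f_j + 1)_+
## 'f' is the vector of class-wise function values, 'yi' the observed class
hinge2_loss <- function(f, yi) sum(pmax(f[-yi] + 1, 0))
hinge2_loss(c(0.8, -0.3, -0.9), yi = 1)  # only the wrong classes contribute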
Value
An object of class mbst is returned. For linear models, print, coef, plot and predict methods are available; for nonlinear models, print and predict methods are available.
x, y, cost, family, learner, control.tree, maxdepth |
the input variables and parameters. |
ctrl |
the input ctrl, possibly with an updated fk when family="thingeDC". |
yhat |
predicted function estimates. |
ens |
a list of length mstop; each element is a base learner fitted to the negative gradient (pseudo-residuals) of the loss at the current function estimate (illustrated in the sketch below). |
ml.fit |
the last element of ens. |
ensemble |
a vector of length mstop; each element is the variable selected at that boosting iteration, when applicable. |
xselect |
variables selected over the mstop boosting iterations. |
coef |
estimated coefficients at each iteration. Used internally only. |
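A brief sketch of inspecting these components, assuming dat.m has been fitted as in the Examples:

dat.m$xselect      # variables selected across iterations
head(dat.m$yhat)   # function estimates on the training data
length(dat.m$ens)  # one fitted base learner per boosting iteration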
Author(s)
Zhu Wang
References
Zhu Wang (2011), HingeBoost: ROC-Based Boost for Classification and Variable Selection. The International Journal of Biostatistics, 7(1), Article 13.
Zhu Wang (2012), Multi-class HingeBoost: Method and Application to the Classification of Cancer Types Using Gene Expression Data. Methods of Information in Medicine, 51(2), 162–7.
See Also
cv.mbst for the cross-validated stopping iteration; see also bst_control.
Examples
x <- matrix(rnorm(100*5), ncol=5)
## cut points on the first variable define three classes
cutpt <- quantile(x[,1], probs=c(0.33, 0.67))
y <- rep(1, 100)
y[x[,1] > cutpt[1] & x[,1] < cutpt[2]] <- 2
y[x[,1] > cutpt[2]] <- 3
x <- as.data.frame(x)
dat.m <- mbst(x, y, ctrl = bst_control(mstop=50), family = "hinge", learner = "ls")
predict(dat.m)
## twin boosting, initialized from the first fit
dat.m1 <- mbst(x, y, ctrl = bst_control(twinboost=TRUE,
    f.init=predict(dat.m), xselect.init = dat.m$xselect, mstop=50))
## robust multi-class boosting with the truncated hinge loss
dat.m2 <- rmbst(x, y, ctrl = bst_control(mstop=50, s=1, trace=TRUE),
    rfamily = "thinge", learner = "ls")
predict(dat.m2)
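A possible follow-up, sketched with hypothetical hold-out data xnew and ynew (same columns as x), evaluating the misclassification error at the final iteration:

## Not run:
## err <- predict(dat.m, newdata = xnew, newy = ynew, mstop = 50, type = "error")
## End(Not run)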