LogitBoost {ModTools} | R Documentation |
LogitBoost Classification Algorithm
Description
Trains the LogitBoost classification algorithm using decision stumps (one-node decision trees) as weak learners.
Usage
LogitBoost(x, ...)
## S3 method for class 'formula'
LogitBoost(formula, data, ..., subset, na.action)
## Default S3 method:
LogitBoost(x, y, nIter=ncol(x), ...)
Arguments
formula |
a formula expression as for regression models, of the form response ~ predictors. |
data |
an optional data frame in which to interpret the variables occurring in formula. |
... |
additional arguments passed on to the default method, e.g. nIter. |
subset |
expression saying which subset of the rows of the data should be used in the fit. All observations are included by default. |
na.action |
a function to filter missing data. |
x |
A matrix or data frame with training data. Rows contain samples and columns contain features. |
y |
Class labels for the training data samples:
a response vector with one label for each row/component of x. |
nIter |
An integer giving the number of iterations for which boosting should be run, i.e. the number of decision stumps that will be used. Defaults to ncol(x). |
Details
The function was adapted from the logitboost.R function written by Marcel
Dettling; see the References and "See Also" sections. The code was modified to
make it much faster for very large data sets. The speed-up was achieved by
implementing an internal version of the decision stump classifier instead of
calling rpart. That way, some of the most time-consuming operations are
precomputed once instead of being performed at each iteration. Another
difference is that the training and testing phases of the classification
process were split into separate functions.
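To illustrate the "precompute once" idea described above, here is a minimal sketch of a weighted decision stump in base R. It is not the package's internal code: the helper names presort_columns, fit_stump and predict_stump are hypothetical, the labels are assumed to be coded as -1/+1, and x is assumed to be a numeric matrix. The sort order of every feature column is computed a single time and could then be reused in every boosting iteration, where only the weights w change.

```r
# Hypothetical helpers for illustration only; not part of ModTools.

# Compute the sort order of every feature column once (n x p index matrix).
presort_columns <- function(x) {
  apply(x, 2, order)
}

# Fit one weighted decision stump for labels y in {-1, +1}.
# 'ord' is the precomputed output of presort_columns(x).
fit_stump <- function(x, y, w, ord) {
  best <- list(err = Inf, feature = NA_integer_,
               threshold = NA_real_, sign = NA_integer_)
  total     <- sum(w)
  total_neg <- sum(w[y == -1])
  for (j in seq_len(ncol(x))) {
    idx <- ord[, j]
    xs  <- x[idx, j]
    ys  <- y[idx]
    ws  <- w[idx]
    # Cumulative weight of positives / negatives at or below each sorted value
    cum_pos <- cumsum(ws * (ys == 1))
    cum_neg <- cumsum(ws * (ys == -1))
    # Weighted error of the rule "predict +1 if x > threshold"
    err_up <- cum_pos + (total_neg - cum_neg)
    # Weighted error of the mirrored rule "predict -1 if x > threshold"
    err_dn <- total - err_up
    # Only split between two distinct neighbouring values
    cut <- which(diff(xs) > 0)
    if (length(cut) == 0) next
    e_up <- err_up[cut]
    e_dn <- err_dn[cut]
    k    <- which.min(pmin(e_up, e_dn))
    err  <- min(e_up[k], e_dn[k])
    if (err < best$err) {
      best <- list(err = err, feature = j,
                   threshold = (xs[cut[k]] + xs[cut[k] + 1]) / 2,
                   sign = if (e_up[k] <= e_dn[k]) 1L else -1L)
    }
  }
  best
}

# Apply a fitted stump to new data.
predict_stump <- function(stump, x) {
  ifelse(x[, stump$feature] > stump$threshold, stump$sign, -stump$sign)
}

# Usage: separate setosa from the other two iris species.
x   <- as.matrix(iris[, 1:4])
y   <- ifelse(iris$Species == "setosa", 1, -1)
ord <- presort_columns(x)            # done once
w   <- rep(1 / nrow(x), nrow(x))     # uniform weights for the first iteration
s   <- fit_stump(x, y, w, ord)
mean(predict_stump(s, x) == y)       # training accuracy of the single stump
```

In a boosting loop, only w (and, for LogitBoost, the working response) would be updated between calls, while ord stays fixed; avoiding a re-sort at every iteration is the kind of saving the paragraph above refers to.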
Value
An object of class "LogitBoost" including components:
Stump |
List of decision stumps (one-node decision trees) used:
If there are more than two classes, then several "Stumps" will be
|
lablist |
names of each class |
Author(s)
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
References
Dettling and Bühlmann (2002), Boosting for Tumor Classification with Gene Expression Data.
Examples
# basic interface
r.lb <- LogitBoost(Species ~ ., data=iris, nIter=20)
pred <- predict(r.lb)
prob <- predict(r.lb, type="prob")
d.res <- data.frame(pred, prob)
d.res[1:10, ]
# accuracy increases with nIter (at least for train set)
table(predict(r.lb, iris, type="class", nIter= 2), iris$Species)
table(predict(r.lb, iris, type="class", nIter=10), iris$Species)
table(predict(r.lb, iris, type="class"), iris$Species)
# example of splitting the data into train and test sets
d.set <- SplitTrainTest(iris)
r.lb <- LogitBoost(Species ~ ., data=d.set$train, nIter=10)
table(predict(r.lb, d.set$test, type="class", nIter=2), d.set$test$Species)
table(predict(r.lb, d.set$test, type="class"), d.set$test$Species)