R: Runs v-fold cross validation with AdaBoost.M1 or SAMME

boosting.cv {adabag}

R Documentation

Runs v-fold cross validation with AdaBoost.M1 or SAMME

Description

The data are divided into v non-overlapping subsets of roughly equal size. Boosting is then fitted on v-1 of the subsets, and predictions are made for the left-out subset. The process is repeated until each of the v subsets has been left out once.
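The partitioning step can be sketched in base R as follows. This is a minimal illustration rather than the package's internal code: the helper name `make_folds` and the seed are assumptions introduced here, and adabag may assign observations to folds differently.

```r
# Hypothetical helper: assign each of n observations to one of v folds
# of roughly equal size, in random order.
make_folds <- function(n, v) {
  stopifnot(v >= 2, v <= n)
  # rep_len() cycles 1..v until n labels exist; sample() shuffles them
  sample(rep_len(seq_len(v), n))
}

set.seed(1)                        # arbitrary seed, for reproducibility only
folds <- make_folds(nrow(iris), v = 10)

for (k in seq_len(10)) {
  test_idx  <- which(folds == k)   # the left-out subset
  train_idx <- which(folds != k)   # the other v-1 subsets
  # a classifier would be fitted on iris[train_idx, ] and used to
  # predict iris[test_idx, ] at this point
}

table(folds)                       # fold sizes differ by at most one
```

Setting `v` equal to `nrow(iris)` in this sketch puts one observation in each fold, which corresponds to leave-one-out cross validation.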

Usage

boosting.cv(formula, data, v = 10, boos = TRUE, mfinal = 100, 
 coeflearn = "Breiman", control, par=FALSE)

Arguments

`formula`	a formula, as in the `lm` function.
`data`	a data frame in which to interpret the variables named in `formula`.
`boos`	if `TRUE` (by default), a bootstrap sample of the training set is drawn using the weights for each observation on that iteration. If `FALSE`, every observation is used with its weights.
`v`	an integer specifying the number of folds used in the v-fold cross validation. Defaults to 10. If `v` equals the number of observations, leave-one-out cross validation is carried out. Any other value between two and the number of observations is also valid and means that roughly every v-th observation is left out.
`mfinal`	an integer, the number of iterations for which boosting is run or the number of trees to use. Defaults to `mfinal=100` iterations.
`coeflearn`	if 'Breiman' (by default), `alpha=(1/2)ln((1-err)/err)` is used. If 'Freund', `alpha=ln((1-err)/err)` is used. In both cases the AdaBoost.M1 algorithm is used and `alpha` is the weight updating coefficient. On the other hand, if `coeflearn` is 'Zhu', the SAMME algorithm is implemented with `alpha=ln((1-err)/err)+ln(nclasses-1)`.
`control`	options that control details of the rpart algorithm. See `rpart.control` for more details.
`par`	if `TRUE`, the cross validation process is run in parallel. If `FALSE` (by default), the function runs without parallelization.
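The three `coeflearn` formulas can be compared numerically with a small helper. The function `alpha_coef` is a hypothetical name used only for this illustration. It transcribes the formulas from the argument description and is not taken from the adabag source.

```r
# Hypothetical helper: coefficient alpha for a weak learner with
# weighted error `err`, following the formulas in the `coeflearn` entry.
alpha_coef <- function(err, coeflearn = c("Breiman", "Freund", "Zhu"),
                       nclasses = 2) {
  coeflearn <- match.arg(coeflearn)
  stopifnot(err > 0, err < 1)
  log_odds <- log((1 - err) / err)
  switch(coeflearn,
         Breiman = 0.5 * log_odds,
         Freund  = log_odds,
         Zhu     = log_odds + log(nclasses - 1))
}

err <- 0.25
alpha_coef(err, "Breiman")              # half of the Freund value
alpha_coef(err, "Freund")               # log(3)
alpha_coef(err, "Zhu", nclasses = 3)    # log(3) + log(2)
```

With two classes the term `ln(nclasses-1)` is zero, so in this sketch the 'Zhu' coefficient coincides with the 'Freund' one.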

Value

An object of class boosting.cv, which is a list with the following components:

`class`	the class predicted by the ensemble classifier.
`confusion`	the confusion matrix which compares the real class with the predicted one.
`error`	the average error.
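As a rough illustration of how a confusion matrix and an error rate relate, the sketch below cross-tabulates two made-up label vectors and computes the misclassification rate from the result. It assumes the error is the overall share of misclassified observations. The exact layout of the table returned by `boosting.cv` (for instance, which dimension holds the predicted class) is not asserted here.

```r
# Made-up observed and predicted labels, for illustration only
observed  <- factor(c("a", "a", "b", "b", "c", "c", "c", "a"))
predicted <- factor(c("a", "b", "b", "b", "c", "c", "a", "a"),
                    levels = levels(observed))

# Cross-tabulate predicted against observed classes
confusion <- table(Predicted = predicted, Observed = observed)

# Misclassification rate: share of observations off the diagonal
error <- 1 - sum(diag(confusion)) / sum(confusion)
error
```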

Author(s)

Esteban Alfaro-Cortes <Esteban.Alfaro@uclm.es>, Matias Gamez-Martinez <Matias.Gamez@uclm.es> and Noelia Garcia-Rubio <Noelia.Garcia@uclm.es>

References

Alfaro, E., Gamez, M. and Garcia, N. (2013): “adabag: An R Package for Classification with Boosting and Bagging”. Journal of Statistical Software, Vol 54, 2, pp. 1–35.

Alfaro, E., Garcia, N., Gamez, M. and Elizondo, D. (2008): “Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks”. Decision Support Systems, 45, pp. 110–122.

Breiman, L. (1998): “Arcing classifiers”. The Annals of Statistics, Vol 26, 3, pp. 801–849.

Freund, Y. and Schapire, R.E. (1996): “Experiments with a new boosting algorithm”. In Proceedings of the Thirteenth International Conference on Machine Learning, pp. 148–156, Morgan Kaufmann.

Zhu, J., Zou, H., Rosset, S. and Hastie, T. (2009): “Multi-class AdaBoost”. Statistics and Its Interface, 2, pp. 349–360.

Examples


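## adabag provides boosting.cv; rpart provides rpart.control
library(adabag)
library(rpart)
data(iris)
iris.boostcv <- boosting.cv(Species ~ ., data=iris, v=2, mfinal=5,
  control=rpart.control(cp=0.01))
## drop the first component (the predicted classes) before printing
iris.boostcv[-1]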

## rpart and mlbench libraries should be loaded
## Data Vehicle (four classes) 
#This example has been hidden to fulfill execution time <5s 
#data(Vehicle)
#Vehicle.boost.cv <- boosting.cv(Class ~.,data=Vehicle,v=5, mfinal=10, coeflearn="Zhu",
#control=rpart.control(maxdepth=5))
#Vehicle.boost.cv[-1]

[Package adabag version 5.0 Index]