autoprune {adabag}R Documentation

Builds automatically a pruned tree of class rpart

Description

Builds automatically a pruned tree of class rpart looking in the cptable for the minimum cross validation error plus a standard deviation

Usage

autoprune(formula, data, subset=1:length(data[,1]), ...)

Arguments

formula

a formula, as in the lm function.

data

a data frame in which to interpret the variables named in the formula.

subset

optional expression saying that only a subset of the rows of the data should be used in the fit, as in the rpart function.

...

further arguments passed to or from other methods.

Details

The cross validation estimation of the error (xerror) has a random component. To avoid this randomness the 1-SE rule (or 1-SD rule) selects the simplest model with a xerror equal or less than the minimum xerror plus the standard deviation of the minimum xerror.

Value

An object of class rpart

Author(s)

Esteban Alfaro-Cortes Esteban.Alfaro@uclm.es, Matias Gamez-Martinez Matias.Gamez@uclm.es and Noelia Garcia-Rubio Noelia.Garcia@uclm.es

References

Breiman, L., Friedman, J.H., Olshen, R. and Stone, C.J. (1984): "Classification and Regression Trees". Wadsworth International Group. Belmont

Therneau, T., Atkinson, B. and Ripley, B. (2014). rpart: Recursive Partitioning and Regression Trees. R package version 4.1-5

See Also

rpart

Examples

## rpart library should be loaded
library(rpart)
data(iris)
iris.prune<-autoprune(Species~., data=iris)
iris.prune

## Comparing the test error of rpart and autoprune
library(mlbench)
data(BreastCancer)
l <- length(BreastCancer[,1])
sub <- sample(1:l,2*l/3)

BC.rpart <- rpart(Class~.,data=BreastCancer[sub,-1],cp=-1, maxdepth=5)
BC.rpart.pred <- predict(BC.rpart,newdata=BreastCancer[-sub,-1],type="class")
tb <-table(BC.rpart.pred,BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))


BC.prune<-autoprune(Class~.,data=BreastCancer[,-1],subset=sub)
BC.rpart.pred <- predict(BC.prune,newdata=BreastCancer[-sub,-1],type="class")
tb <-table(BC.rpart.pred,BreastCancer$Class[-sub])
tb
1-(sum(diag(tb))/sum(tb))




[Package adabag version 5.0 Index]