R: Classification tree evaluation by CV

treeEval {chemometrics}

R Documentation

Classification tree evaluation by CV

Description

Evaluation for classification trees by cross-validation

Usage

treeEval(X, grp, train, kfold = 10, cp = seq(0.01, 0.1, by = 0.01), plotit = TRUE, 
   legend = TRUE, legpos = "bottomright", ...)

Arguments

`X`	standardized complete X data matrix (training and test data)
`grp`	factor with groups for complete data (training and test data)
`train`	row indices of X indicating training data objects
`kfold`	number of folds for cross-validation
`cp`	vector of values for the tree complexity parameter to be evaluated, see `rpart.control`
`plotit`	if TRUE a plot will be generated
`legend`	if TRUE a legend will be added to the plot
`legpos`	positioning of the legend in the plot
`...`	additional plot arguments

Details

The data are split into a calibration (training) set and a test set: the rows listed in "train" form the calibration set, and the remaining rows form the test set. Within the calibration set, "kfold"-fold CV is performed by fitting the classification tree to "kfold"-1 parts and evaluating it on the remaining part, so that each part is used once for evaluation. For each value of "cp", the misclassification error is then computed for the training data, for the CV test data (CV error), and for the test data.
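The scheme described above can be sketched with `rpart` directly. This is a minimal illustration for a single value of `cp`, not the package source: the helper names `misclass` and `tree_cv_sketch` are hypothetical, the random fold assignment is an assumption, and the standard error is taken as the standard deviation of the fold errors divided by the square root of `kfold`. The built-in `iris` data are used only to make the sketch runnable.

```r
library(rpart)   # recommended package shipped with R

# Hypothetical helper: misclassification rate of a fitted tree on new data
misclass <- function(fit, newdata, truth) {
  pred <- predict(fit, newdata = newdata, type = "class")
  mean(as.character(pred) != as.character(truth))
}

# Sketch of the evaluation scheme for one value of cp (assumed, not the package code)
tree_cv_sketch <- function(X, grp, train, kfold = 10, cp = 0.01) {
  dat   <- data.frame(grp = grp, X)
  calib <- dat[train, ]    # calibration (training) objects
  test  <- dat[-train, ]   # remaining objects form the test set

  # Fit on the whole calibration set: training error and test error
  fit      <- rpart(grp ~ ., data = calib, method = "class", cp = cp)
  trainerr <- misclass(fit, calib, calib$grp)
  testerr  <- misclass(fit, test, test$grp)

  # Randomly assign each calibration object to one of kfold segments
  fold  <- sample(rep(seq_len(kfold), length.out = nrow(calib)))
  cverr <- numeric(kfold)
  for (i in seq_len(kfold)) {
    # Fit on kfold-1 segments, evaluate on the segment left out
    fit_i    <- rpart(grp ~ ., data = calib[fold != i, ], method = "class", cp = cp)
    cverr[i] <- misclass(fit_i, calib[fold == i, ], calib$grp[fold == i])
  }

  list(trainerr = trainerr, testerr = testerr,
       cvMean = mean(cverr), cvSe = sd(cverr) / sqrt(kfold), cverr = cverr)
}

set.seed(1)
train <- sample(nrow(iris), 100)
res <- tree_cv_sketch(scale(iris[, 1:4]), iris$Species, train, kfold = 5)
str(res)
```

Repeating this for every element of a `cp` vector would give one training, test, and CV error per complexity value.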

Value

`trainerr`	training error rate
`testerr`	test error rate
`cvMean`	mean of CV errors
`cvSe`	standard error of CV errors
`cverr`	all errors from CV
`cp`	values of the tree complexity parameter, as supplied in the input
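One possible use of the returned values is to choose a complexity parameter with the one-standard-error rule. The sketch below assumes that `cvMean` and `cvSe` hold one entry per element of `cp`; the function name `pick_cp_1se` is hypothetical and the numbers in the usage lines are made up for illustration, not results from any data set.

```r
# Hypothetical helper: largest cp (most strongly pruned tree) whose mean CV error
# lies within one standard error of the smallest mean CV error
pick_cp_1se <- function(cp, cvMean, cvSe) {
  best      <- which.min(cvMean)            # position of the smallest mean CV error
  threshold <- cvMean[best] + cvSe[best]    # minimum plus one standard error
  max(cp[cvMean <= threshold])
}

# Illustrative (made-up) values
cp_grid <- c(0.01, 0.05, 0.10)
cv_mean <- c(0.30, 0.31, 0.40)
cv_se   <- c(0.02, 0.02, 0.02)
chosen  <- pick_cp_1se(cp_grid, cv_mean, cv_se)
print(chosen)   # 0.05
```

With a result object such as `restree` from the example section, the call would presumably be `pick_cp_1se(restree$cp, restree$cvMean, restree$cvSe)`.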

Author(s)

Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

References

K. Varmuza and P. Filzmoser: Introduction to Multivariate Statistical Analysis in Chemometrics. CRC Press, Boca Raton, FL, 2009.

Examples

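library(chemometrics)         # provides treeEval
library(rpart)                # tree fitting
data(fgl,package="MASS")
grp=fgl$type                  # glass type (factor)
X=scale(fgl[,1:9])            # standardize the 9 predictors
k=length(unique(grp))         # number of groups
dat=data.frame(grp,X)
n=nrow(X)
ntrain=round(n*2/3)           # two thirds of the objects for training
set.seed(123)
train=sample(1:n,ntrain)      # row indices of the training objects
par(mar=c(4,4,3,1))
# seq() builds the cp grid; 0.02:0.05 would give only 0.02 because ":" steps by 1
restree=treeEval(X,grp,train,cp=c(0.01,seq(0.02,0.05,by=0.01),0.1,0.15,seq(0.2,0.5,by=0.1),1))
title("Classification trees")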

[Package chemometrics version 1.4.4 Index]