pairs.ada {ada}    R Documentation

Pairwise Plots and Variable Importance Plot for Ada

Description

This command produces pairwise plots of the data. The upper panels of the pairwise plots color the observations by observed class membership (if membership is provided). The lower panels color the observations by predicted class. In addition, the plotting symbol is scaled by the class probability estimate from adaboost.

The varplot command produces a variable importance plot using the improve criterion given in the reference (Hastie et al., 2001, p. 332). This is a rather standard measure for determining variable importance.

Usage


## S3 method for class 'ada'
pairs(x, train.data = NULL, vars = NULL, maxvar = 10, 
                    test.x = NULL, test.y = NULL, 
                    test.only = FALSE,col=c(2,4),pch=c(1,2), ...)

varplot(x, plot.it = TRUE, type = c("none","scores"),max.var.show=30, ...)

Arguments

x

object generated by ‘ada’.

train.data

the ‘data.frame’ of the original data used to train the classifier. The names of this ‘data.frame’ must match the variable names in the object generated by ‘ada’. ‘train.data’ is only used for the ‘pairs’ command. Default = NULL.

vars

a vector of variables to include in the plot. Each variable number must correspond to a specific column in ‘train.data’. For example, vars=c(1,2) generates a plot for the first two columns of ‘train.data’. Note: ‘vars’ is only used for the ‘pairs’ command. Default = NULL.

maxvar

the maximum number of variables for the pairwise plot. If maxvar = 5, then ‘varplot’ chooses the five most important variables and places these in descending order in the plot. ‘maxvar’ is only used for the ‘pairs’ command. Default = 10.

test.x

an option to plot pairwise descriptors for a test data set. ‘test.x’ should be of type ‘data.frame’. ‘test.x’ is only used for the ‘pairs’ command. Default = NULL.

test.y

the corresponding response for the test data set. If ‘test.y’ is not specified, then the symbols for the test data in the pairwise plots are colored black; training data are colored by class. ‘test.y’ is only used for the ‘pairs’ command. Default = NULL.

test.only

provides pairwise plots for test data only (test.only = TRUE). If ‘test.y’ is not specified, then ‘test.only’ is ignored. ‘test.only’ is only used for the ‘pairs’ command. Default = FALSE.

col

colors for the plot symbols, one for each class. Default col=c(2,4) (i.e. red and blue).

pch

plotting symbols (pch), one for each of the two classes. Default pch=c(1,2) (i.e. circle and triangle).

...

Arguments to be passed into ‘pairs.default’. Do not set the upper and lower panel. This is only used for the pairs command.

plot.it

provides a plot of frequencies for each variable (plot.it = TRUE). ‘plot.it’ is only used for the ‘varplot’ command. Default = TRUE.

type

if type=“none” then nothing is returned. Default = “none”. If type=“scores”, the frequencies are returned.

max.var.show

if plot.it is TRUE, then this controls the number of variables shown in the plot. Default = 30.

Details

The ‘varplot’ command provides a sense of variable importance: the more frequently a variable is selected for boosting, the more likely the variable contains useful information for classification. Pairwise interactions of important variables can then be visualized using ‘pairs’. Note: The ‘pairs’ command calls the ‘varplot’ command.
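The workflow described above can be sketched as follows. This is a minimal illustration, not an example shipped with this page: it assumes the ‘ada’ package is installed, uses the built-in ‘iris’ data reduced to two classes, and the split size, seed, and the names ‘dat’, ‘train’, and ‘fit’ are arbitrary choices made here.

```r
library(ada)  # assumed to be installed; attaches rpart as a dependency

# Two-class problem: drop one iris species and its unused factor level
dat <- droplevels(iris[iris$Species != "setosa", ])

# Illustrative 60% training sample (seed and split size are arbitrary)
set.seed(1)
train_idx <- sample(nrow(dat), floor(0.6 * nrow(dat)))
train <- dat[train_idx, ]

# Fit discrete AdaBoost with 20 boosting iterations
fit <- ada(Species ~ ., data = train, iter = 20, nu = 1, type = "discrete")

# Variable importance plot
varplot(fit)

# Pairwise plot of the two most important variables;
# train.data holds the predictors only (response column 5 removed)
pairs(fit, train[, -5], maxvar = 2)
```

Passing the predictors without the response column keeps the names of ‘train.data’ aligned with the variable names stored in the fitted object.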

Value

scores

If type=“scores” then the frequencies for each variable are returned by the varplot command.
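As a hedged sketch of retrieving the scores without drawing the plot, the call below combines plot.it = FALSE with type = "scores". The data set, seed, and the names ‘dat’, ‘fit’, ‘scores’, and ‘top2’ are assumptions made for illustration; the returned object is treated here as a named numeric vector, which should be checked against the installed package version.

```r
library(ada)  # assumed to be installed

# Two-class version of iris, as an illustrative data set
dat <- droplevels(iris[iris$Species != "setosa", ])

set.seed(1)
fit <- ada(Species ~ ., data = dat, iter = 20, type = "discrete")

# Return the variable scores and suppress the plot
scores <- varplot(fit, plot.it = FALSE, type = "scores")
print(scores)

# Names of the two highest-ranked variables (assumes the vector is named)
top2 <- names(scores)[1:2]
```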

Note

This plot was designed as a tool to use with adaboost. Please send any comments or suggestions for improvement to the authors.

References

Culp, M., Johnson, K., Michailidis, G. (200X). ada: an R Package for Boosting. Journal of Statistical Software, (XX)XX.


[Package ada version 2.0-5 Index]