pairs.ada {ada} | R Documentation |
Pairwise Plots and Variable Importance Plot for Ada
Description
This command produces pairwise plots of the data. The upper panels of the pairwise plots color the observations by observed class membership (if membership is provided). The lower panels color the observations by predicted class. In addition, the plotting symbol is scaled by the class probability estimate from adaboost.
The varplot command produces a variable importance plot using the improve criterion given in the reference (Hastie et al., 2001, p. 332). This is a rather standard measure for determining variable importance.
Usage
## S3 method for class 'ada'
pairs(x, train.data = NULL, vars = NULL, maxvar = 10,
test.x = NULL, test.y = NULL,
test.only = FALSE, col = c(2, 4), pch = c(1, 2), ...)
varplot(x, plot.it = TRUE, type = c("none", "scores"), max.var.show = 30, ...)
Arguments
x |
object generated by ‘ada’. |
train.data |
the ‘data.frame’ of the original data used to train the classifier. The names of this ‘data.frame’ must be the same as the variable names in the object generated by ‘ada’. ‘train.data’ is used by the ‘pairs’ command. Default = NULL. |
vars |
a vector of variables to include in the plot. Each variable number must correspond to a specific column in ‘train.data’. For example, vars=c(1,2) generates a plot for the first two columns of ‘train.data’. Note: ‘vars’ is only used for the ‘pairs’ command. Default = NULL. |
maxvar |
the maximum number of variables for the pairwise plot. If maxvar = 5, then ‘varplot’ chooses the five most important variables and places these in descending order in the plot. ‘maxvar’ is only used for the ‘pairs’ command. Default = 10. |
test.x |
an option to plot pairwise descriptors for a test data set. ‘test.x’ should be of type ‘data.frame’. ‘test.x’ is only used for the ‘pairs’ command. Default = NULL. |
test.y |
the corresponding response for the test data set. If ‘test.y’ is not specified, then the symbols for the test data in the pairwise plots are black; training data are colored by class. ‘test.y’ is only used for the ‘pairs’ command. Default = NULL. |
test.only |
provides pairwise plots for test data only (test.only = TRUE). If ‘test.y’ is not specified, then ‘test.only’ is ignored. ‘test.only’ is only used for the ‘pairs’ command. Default = FALSE. |
col |
colors for the plot symbols, one for each class. Default col=c(2,4) (i.e., red and blue). |
pch |
pch values for the plot; set two symbols. Default pch=c(1,2) (i.e., circle and triangle). |
... |
Arguments to be passed into ‘pairs.default’. Do not set the upper and lower panel. This is only used for the pairs command. |
plot.it |
provides a plot of frequencies for each variable (plot.it = TRUE). ‘plot.it’ is only used for the ‘varplot’ command. Default = TRUE. |
type |
if type=“none” then nothing is returned. Default = “none”. If type=“scores”, the frequencies are returned. |
max.var.show |
if ‘plot.it’ is TRUE, this controls the number of variables shown in the plot. Default = 30. |
Details
The ‘varplot’ command provides a sense of variable importance: the more frequently a variable is selected for boosting, the more likely the variable contains useful information for classification. Pairwise interactions of important variables can then be visualized using ‘pairs’. Note: The ‘pairs’ command calls the ‘varplot’ command.
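The workflow described above might look like the following minimal sketch. The two-class subset of ‘iris’, the object names (‘dat’, ‘train_x’, ‘fit’), and the ‘ada’ settings (iter = 20, nu = 1) are illustrative assumptions and are not taken from this page; only the ‘varplot’ and ‘pairs’ arguments follow the Usage section.

```r
library(ada)

# Assumed example data: iris reduced to two classes, since ada fits
# a two-class classifier.
set.seed(1)
dat <- droplevels(iris[iris$Species != "setosa", ])
train_idx <- sample(nrow(dat), floor(0.6 * nrow(dat)))
train_x <- dat[train_idx, -5]   # predictors only, same names as used in the fit

# Fit a small boosted model (settings chosen only to keep the example quick).
fit <- ada(Species ~ ., data = dat[train_idx, ],
           iter = 20, nu = 1, type = "discrete")

# Variable importance plot first ...
varplot(fit)

# ... then pairwise plots of the two most important variables.
pairs(fit, train.data = train_x, maxvar = 2)
```

Because ‘pairs’ calls ‘varplot’ itself, the explicit ‘varplot’ call is there only to inspect the ranking before choosing ‘maxvar’.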
Value
scores |
If type=“scores” then the frequencies for each variable are returned by the ‘varplot’ command. |
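As a rough illustration of the returned value, the sketch below requests the scores without drawing the plot. The data set and object names are assumptions made for this example, and the comment about the shape of the result describes what one would expect rather than a documented guarantee.

```r
library(ada)

# Assumed example data: two-class subset of iris.
set.seed(1)
dat <- droplevels(iris[iris$Species != "setosa", ])
fit <- ada(Species ~ ., data = dat, iter = 20, nu = 1, type = "discrete")

# plot.it = FALSE suppresses the plot; type = "scores" asks for the values.
scores <- varplot(fit, plot.it = FALSE, type = "scores")

# Expected (assumption): a numeric vector with one entry per variable shown.
print(scores)
```

With the default type = “none”, the same call would return nothing, so the assignment is only useful together with type = “scores”.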
Note
This plot was designed as a tool to use with adaboost. Please send any comments or suggestions for improvement to the authors.
References
Culp, M., Johnson, K., Michailidis, G. (200X). ada: an R Package for Boosting. Journal of Statistical Software, (XX)XX.