randomVarImpsRFplot {varSelRF} | R Documentation |
Plot random variable importances
Description
Plot variable importances from random permutations of class labels and the variable importances from the original data set.
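As a minimal base-R sketch of what "random permutation of class labels" means here: shuffling the label vector keeps the class frequencies intact but breaks any association between the labels and the predictors. The object names below are illustrative only, and this is not the package's internal code.

```r
set.seed(1)

# Class labels for 45 samples (20 of class A, 25 of class B)
cl <- factor(c(rep("A", 20), rep("B", 25)))

# One random permutation of the labels: same labels, shuffled order
cl_perm <- sample(cl)

# Class frequencies are unchanged by the permutation
table(cl)
table(cl_perm)
```

Importances computed from a forest fitted to `cl_perm` then give a reference for what importances look like when no variable is truly related to the class.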
Usage
randomVarImpsRFplot(randomImportances, forest,
whichImp = "impsUnscaled", nvars = NULL,
show.var.names = FALSE, vars.highlight = NULL,
main = NULL, screeRandom = TRUE,
lwdBlack = 1.5,
lwdRed = 2,
lwdLightblue = 1,
cexPoint = 1,
overlayTrue = FALSE,
xlab = NULL,
ylab = NULL, ...)
Arguments
randomImportances |
A list with a structure such as the object returned by randomVarImpsRF. |
forest |
A random forest fitted to the original data. This forest must have been fitted with importance = TRUE. |
whichImp |
The importance measure to use. One (only one) of the supported measure names; the default is "impsUnscaled". |
nvars |
If NULL will show the plot for the complete range of variables. If an integer, will plot only the most important nvars. |
show.var.names |
If TRUE, show the variable names in the plot. Unless you are plotting few variables, it probably won't be of any use. |
vars.highlight |
A vector indicating the variables to highlight in the plot with a vertical blue segment. You need to pass here a vector of variable names, not variable positions. |
main |
The title for the plot. |
screeRandom |
If TRUE, order all the variable importances (i.e., those from both the original and the permuted class labels data sets) from largest to smallest before plotting. The plot will thus resemble a usual "scree plot". |
lwdBlack |
The width of the line to use for the importances from the original data set. |
lwdRed |
The width of the line to use for the average of the importances for the permuted data sets. |
lwdLightblue |
The width of the line for the importances for the individual permuted data sets. |
cexPoint |
|
overlayTrue |
If TRUE, the variable importances from the original data set will be plotted last, so you can see them even if they are buried in the middle of many green lines; this can help when the plot does not otherwise let you see the black line. |
xlab |
The title for the x-axis (see |
ylab |
The title for the y-axis (see |
... |
Additional arguments to plot. |
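The following sketch combines several of the arguments above in one call. It assumes the varSelRF and randomForest packages are installed, reuses the simulated data from the Examples section, and picks "V1" and "V2" for highlighting only because those are the two columns shifted in that simulation. The title string is arbitrary.

```r
library(varSelRF)
library(randomForest)

set.seed(1)
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
colnames(x) <- paste0("V", seq.int(ncol(x)))
cl <- factor(c(rep("A", 20), rep("B", 25)))

# The forest must be fitted with importance = TRUE
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)
rf.rvi <- randomVarImpsRF(x, cl, rf,
                          numrandom = 20,
                          usingCluster = FALSE)

# Show only the 10 most important variables, label them,
# highlight two variables by name, and draw the black line last
randomVarImpsRFplot(rf.rvi, rf,
                    nvars = 10,
                    show.var.names = TRUE,
                    vars.highlight = c("V1", "V2"),
                    overlayTrue = TRUE,
                    main = "Original vs. permuted importances")
```

Note that vars.highlight takes variable names, so the columns of x need names for the highlighting to have any effect.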
Value
Only used for its side effects of producing plots. In particular, you will see lines of three colors:
black |
Connects the variable importances from the original simulated data. |
green |
Connects the variable importances from the data sets with permuted class labels; there will be as many lines as permuted data sets. |
red |
Connects the average of the importances from the permuted data sets. |
Additionally, if you used a valid set of values for vars.highlight, these will be shown with a vertical blue segment.
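To make the layout of the three kinds of lines concrete, here is a base-R sketch that draws a plot of the same general shape from made-up importances. It is not the package's implementation: the numbers are simulated stand-ins, all object names are hypothetical, and it assumes that each set of importances is sorted independently (as described for screeRandom = TRUE) and that the red line averages the sorted permuted importances.

```r
set.seed(42)
n_vars <- 30   # number of variables
n_perm <- 20   # number of permuted data sets

# Made-up importances: two clearly informative variables in the original data
imp_orig <- c(0.05, 0.04, abs(rnorm(n_vars - 2, sd = 0.005)))

# Made-up importances for the permuted data sets, one column per data set
imp_perm <- matrix(abs(rnorm(n_vars * n_perm, sd = 0.005)), nrow = n_vars)

# Scree-style ordering: sort each set from largest to smallest
orig_sorted <- sort(imp_orig, decreasing = TRUE)
perm_sorted <- apply(imp_perm, 2, sort, decreasing = TRUE)
perm_mean <- rowMeans(perm_sorted)

# Green: one line per permuted data set
matplot(perm_sorted, type = "l", lty = 1, col = "green", lwd = 1,
        xlab = "Variable rank", ylab = "Importance",
        ylim = range(orig_sorted, perm_sorted))
# Red: average over the permuted data sets
lines(perm_mean, col = "red", lwd = 2)
# Black: original data, drawn last so it stays visible
lines(orig_sorted, col = "black", lwd = 1.5)
```

Where the black line sits well above the band of green lines, the corresponding importances are larger than what permuted labels alone tend to produce.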
Note
These plots resemble the scree plots commonly used with principal component analysis, and the actual choice of colors was taken from the importance spectrum plots of Friedman & Meulman.
Author(s)
Ramon Diaz-Uriarte rdiaz02@gmail.com
References
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32.
Diaz-Uriarte, R., Alvarez de Andres, S. (2005) Variable selection from random forests: application to gene expression data. Tech. report. http://ligarto.org/rdiaz/Papers/rfVS/randomForestVarSel.html
Friedman, J., Meulman, J. (2005) Clustering objects on subsets of attributes (with discussion). J. Royal Statistical Society, Series B, 66, 815–850.
See Also
randomForest, varSelRF, varSelRFBoot, varSelImpSpecRF, randomVarImpsRF
Examples
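library(varSelRF)      # provides randomVarImpsRF() and randomVarImpsRFplot()
library(randomForest)  # provides randomForest()

## simulated data: 45 samples, 30 variables;
## the first two variables are shifted for the first 20 samples (class A)
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
colnames(x) <- paste0("V", seq.int(ncol(x)))
cl <- factor(c(rep("A", 20), rep("B", 25)))

## the forest must be fitted with importance = TRUE
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

## importances from 20 data sets with permuted class labels
rf.rvi <- randomVarImpsRF(x, cl,
                          rf,
                          numrandom = 20,
                          usingCluster = FALSE)

randomVarImpsRFplot(rf.rvi, rf)

## same plot with variable names; las = 2 makes axis labels perpendicular
op <- par(las = 2)
randomVarImpsRFplot(rf.rvi, rf, show.var.names = TRUE)
par(op)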
## Not run:
## identical, but using a cluster
## make a small cluster, for the sake of illustration
psockCL <- makeCluster(2, "PSOCK")
clusterSetRNGStream(psockCL, iseed = 789)
clusterEvalQ(psockCL, library(varSelRF))
rf.rvi <- randomVarImpsRF(x, cl,
                          rf,
                          numrandom = 20,
                          usingCluster = TRUE,
                          TheCluster = psockCL)
randomVarImpsRFplot(rf.rvi, rf)
stopCluster(psockCL)
## End(Not run)