randomVarImpsRF {varSelRF} | R Documentation |
Variable importances from random forest on permuted class labels
Description
Return variable importances from random forests fitted to data sets that match the original except that the class labels have been randomly permuted.
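As a rough illustration of what permuting the class labels means, the base R sketch below shuffles a factor with sample(). The object names (cl, cl_perm) are made up for this example, and this is not necessarily how randomVarImpsRF carries out the permutation internally.

```r
set.seed(1)  # only to make the illustration reproducible

# A class factor like the one used in the Examples section
cl <- factor(c(rep("A", 20), rep("B", 25)))

# Shuffle the labels: the class counts are unchanged, but the link
# between each case and its label is broken
cl_perm <- sample(cl)

table(cl)
table(cl_perm)
```

Because only the order of the labels changes, any importance measured on a forest fitted to the permuted labels reflects what can arise when no variable is truly related to the class.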
Usage
randomVarImpsRF(xdata, Class, forest, numrandom = 100,
                whichImp = "impsUnscaled", usingCluster = TRUE,
                TheCluster = NULL, ...)
Arguments
xdata |
A data frame or matrix, with subjects/cases in rows and variables in columns. NAs not allowed. |
Class |
The dependent variable; must be a factor. |
forest |
A previously fitted random forest (see |
numrandom |
The number of random permutations of the class labels. |
whichImp |
A vector of one or more of |
usingCluster |
If TRUE use a cluster to parallelize the calculations. |
TheCluster |
The name of the cluster, if one is used. |
... |
Not used. |
Details
The measure of variable importance most often used is based on the decrease of classification accuracy when values of a variable in a node of a tree are permuted randomly (see references); we use the unscaled version (see our paper and supplementary material). Note that, by default, importance returns the scaled version.
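The difference between the scaled and unscaled versions can be seen directly with the importance function from the randomForest package. The sketch below is only an illustration: it reuses the simulated data from the Examples section, and the names imp_scaled and imp_unscaled are chosen for this example.

```r
library(randomForest)

set.seed(1)  # only to make the illustration reproducible
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
cl <- factor(c(rep("A", 20), rep("B", 25)))

# importance = TRUE is needed to obtain the permutation-based measure
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

# type = 1 selects the mean decrease in accuracy
imp_scaled   <- importance(rf, type = 1)                 # scale = TRUE is the default
imp_unscaled <- importance(rf, type = 1, scale = FALSE)  # unscaled version

head(cbind(scaled = imp_scaled[, 1], unscaled = imp_unscaled[, 1]))
```

The two columns generally differ, so the value of scale should be stated explicitly whenever importances from importance are compared with the unscaled ones.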
Value
An object of class randomVarImpsRF, which is a list with one to three named components. The name of each component corresponds to one of the variable importance measures selected (i.e., impsUnscaled, impsScaled, impsGini).
Each component is a matrix, of dimensions number of variables by numrandom; each element (i,j) of this matrix is the variable importance for variable i and random permutation j.
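To show how a component with this layout might be summarized, the sketch below builds a hypothetical stand-in (rvi_mock, filled with simulated numbers rather than real output) and computes per-variable summaries across permutations. With a real object, the same row-wise operations would apply to, for example, its impsUnscaled component. The choice of the mean and the 95th percentile is an assumption made for illustration, not something the package prescribes.

```r
set.seed(1)  # only to make the illustration reproducible

n_vars    <- 30  # number of variables (rows)
numrandom <- 20  # number of label permutations (columns)

# Hypothetical stand-in with the documented layout: one row per
# variable, one column per random permutation. The values are simulated.
rvi_mock <- list(
  impsUnscaled = matrix(rnorm(n_vars * numrandom, sd = 0.01),
                        nrow = n_vars, ncol = numrandom)
)

imps <- rvi_mock$impsUnscaled
dim(imps)  # number of variables by numrandom

# Element (i, j): importance of variable i under permutation j
imps[1, 2]

# Per-variable summaries across the permutations
null_mean <- rowMeans(imps)
null_q95  <- apply(imps, 1, quantile, probs = 0.95)
```

Summaries of this kind give a per-variable reference against which the importances from the forest fitted to the original labels can be compared.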
Author(s)
Ramon Diaz-Uriarte rdiaz02@gmail.com
References
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32.
Diaz-Uriarte, R. and Alvarez de Andres, S. (2005) Variable selection from random forests: application to gene expression data. Tech. report. http://ligarto.org/rdiaz/Papers/rfVS/randomForestVarSel.html
Svetnik, V., Liaw, A., Tong, C. and Wang, T. (2004) Application of Breiman's random forest to modeling structure-activity relationships of pharmaceutical molecules. Pp. 334-343 in F. Roli, J. Kittler, and T. Windeatt (eds.). Multiple Classifier Systems, Fifth International Workshop, MCS 2004, Proceedings, 9-11 June 2004, Cagliari, Italy. Lecture Notes in Computer Science, vol. 3077. Berlin: Springer.
See Also
randomForest, varSelRF, varSelRFBoot, varSelImpSpecRF, randomVarImpsRFplot
Examples
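library(varSelRF)  # varSelRF depends on randomForest

# Simulated data: 45 cases, 30 variables; the first two variables
# are shifted upwards for the first 20 cases (class "A")
x <- matrix(rnorm(45 * 30), ncol = 30)
x[1:20, 1:2] <- x[1:20, 1:2] + 2
cl <- factor(c(rep("A", 20), rep("B", 25)))

# Forest fitted to the original class labels
rf <- randomForest(x, cl, ntree = 200, importance = TRUE)

# Importances from forests fitted to 20 random permutations of the labels
rf.rvi <- randomVarImpsRF(x, cl,
                          rf,
                          numrandom = 20,
                          usingCluster = FALSE)
randomVarImpsRFplot(rf.rvi, rf)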