variableImportance {interpretR} | R Documentation
Permutation-based Variable Importance Measures
Description
variableImportance produces permutation-based variable importance measures (currently only for binary classification models from the package randomForest and only for the performance measure AUROC).
Usage
variableImportance(
object = NULL,
xdata = NULL,
ydata = NULL,
CV = 3,
measure = "AUROC",
sort = TRUE
)
Arguments
object | A model. Currently only binary classification models from the package randomForest are supported.
xdata | A data frame containing the predictors for the model.
ydata | A factor containing the response variable.
CV | Cross-validation. The number of times the data are permuted and the decrease in performance is calculated; the mean over these repetitions is returned. Use a higher value for very small samples to ensure stability.
measure | The performance measure. Currently only Area Under the Receiver Operating Characteristic Curve (AUROC) is supported.
sort | Logical. Should the results be sorted from high to low?
Details
Currently only binary classification models from randomForest are supported, and AUROC is the only supported performance measure. Definition of MeanDecreaseAUROC: the AUROC of the entire ensemble is first recorded on the provided xdata. Each variable is then permuted in turn (one variable at a time, with the others left intact) and the AUROC is recorded again. The AUROC after permutation is subtracted from the original AUROC; this difference is the Decrease in AUROC. Averaging it over the CV repetitions gives the Mean Decrease in AUROC.
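The definition above can be sketched in base R. This is only an illustration of the permutation idea and not the code used by variableImportance: the helper names auroc and mean_decrease_auroc, the output column names, and the toy scoring function are all assumptions made for this example. Any function that returns one score per row can be passed as predict_fun; for a randomForest classifier rf, something like function(d) predict(rf, d, type = "prob")[, 2] would play that role.

```r
# AUROC via the rank-sum (Mann-Whitney) identity.
# `positive` is the factor level treated as the event (assumed: second level).
auroc <- function(score, y, positive = levels(y)[2]) {
  is_pos <- y == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # average ranks, so tied scores count as half
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper (not part of interpretR): mean decrease in AUROC
# after permuting each column of xdata separately, repeated CV times.
mean_decrease_auroc <- function(predict_fun, xdata, ydata, CV = 3, sort = TRUE) {
  baseline <- auroc(predict_fun(xdata), ydata)
  decrease <- sapply(names(xdata), function(v) {
    mean(replicate(CV, {
      permuted <- xdata
      permuted[[v]] <- sample(permuted[[v]])  # shuffle one variable only
      baseline - auroc(predict_fun(permuted), ydata)
    }))
  })
  out <- data.frame(VariableName = names(xdata),
                    MeanDecreaseAUROC = unname(decrease))
  if (sort) out <- out[order(out$MeanDecreaseAUROC, decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

# Toy usage: a fixed score stands in for a fitted model (assumption).
x <- iris[1:100, 1:4]
y <- factor(ifelse(iris$Species[1:100] == "setosa", 0, 1))
toy_score <- function(d) d$Petal.Length + d$Petal.Width
set.seed(1)
imp <- mean_decrease_auroc(toy_score, x, y, CV = 3)
print(imp)
```

Because the toy score ignores the two sepal variables, permuting them leaves the AUROC unchanged and their decrease is exactly zero, while permuting either petal variable lowers it. A larger CV averages over more shuffles and so gives a steadier estimate on small samples.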
Value
A data frame containing the variable names and their mean decrease in AUROC.
Author(s)
Authors: Michel Ballings and Dirk Van den Poel. Maintainer: Michel.Ballings@GMail.com
See Also
Examples
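# Prepare data: keep two species and recode the response as a binary factor
data(iris)
iris <- iris[1:100, ]
iris$Species <- as.factor(ifelse(iris$Species == "setosa", 0, 1))
# Estimate model on a random half of the rows
library(randomForest)
library(interpretR)
set.seed(1)  # added so the split and the permutations are reproducible
ind <- sample(nrow(iris), 50)
rf <- randomForest(Species ~ ., iris[ind, ])
# Obtain variable importances on the held-out rows
variableImportance(object = rf, xdata = iris[-ind, names(iris) != "Species"],
                   ydata = iris[-ind, ]$Species)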