R: Permutation- based Variable Importance Measures

variableImportance {interpretR}

R Documentation

Permutation- based Variable Importance Measures

Description

variableImportance produces permutation- based variable importance measures (currently only for binary classification models from the package randomForest and only for the performance measure AUROC)

Usage

variableImportance(
  object = NULL,
  xdata = NULL,
  ydata = NULL,
  CV = 3,
  measure = "AUROC",
  sort = TRUE
)

Arguments

`object`	A model. Currently only binary classification models from the package `randomForest`.
`xdata`	A data frame containing the predictors for the model.
`ydata`	A factor containing the response variable.
`CV`	Cross-validation. How many times should the data be permuted and the decrease in performance be calculated? Afterwards the mean is taken. CV should be higher for very small samples to ensure stability.
`measure`	Currently only Area Under the Receiver Operating Characteristic Curve (AUROC) is supported.
`sort`	Logical. Should the results be sorted from high to low?

Details

Currently only binary classification models from randomForest are supported. Also, currently only AUROC is supported. Definition of MeanDecreaseAUROC: for the entire ensemble the AUROC is recorded on the provided xdata. The same is subsequently done after permuting each variable (iteratively, for each variable separately). Then the latter is subtracted from the former. This is called the Decrease in AUROC. If we do this for multiple CV, it becomes the Mean Decrease in AUROC.

Value

A data frame containing the variable names and the mean decrease in AUROC

Author(s)

Authors: Michel Ballings, and Dirk Van den Poel, Maintainer: Michel.Ballings@GMail.com

Examples

#Prepare data
data(iris)
iris <- iris[1:100,]
iris$Species <- as.factor(ifelse(factor(iris$Species)=="setosa",0,1))
#Estimate model
library(randomForest)
ind <- sample(nrow(iris),50)
rf <- randomForest(Species~., iris[ind,])
#Obtain variable importances
variableImportance(object=rf, xdata=iris[-ind,names(iris) != "Species"],
ydata=iris[-ind,]$Species)

[Package interpretR version 0.2.5 Index]