GiniImportanceForest {rfVarImpOOB} | R Documentation |
Computes in-bag and out-of-bag (OOB) Gini importance averaged over all trees in a forest
Description
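Workhorse function of this package: computes in-bag and OOB variable importance scores for a random forest and aggregates them across its trees.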
Usage
GiniImportanceForest(RF, data, ylabel = "Survived", zeroLeaf = TRUE,
  agg = c("mean", "median", "none")[1],
  score = c("PMDI21", "MDI", "MDA", "MIA")[1],
  Predictor = Mode, verbose = 0)
Arguments
RF |
object returned by call to randomForest() |
data |
data used to train the RF. NOTE: assumes the forest was trained with keep.inbag=TRUE |
ylabel |
name of dependent variable |
zeroLeaf |
if TRUE, discard the information gain from splits that result in a node with a single observation (n=1) |
agg |
method of aggregating importance scores across trees. If "none", return the raw arrays (for debugging) |
score |
scoring method: MDI = mean decrease impurity (Gini), MDA = mean decrease accuracy (permutation), MIA = mean increase accuracy |
Predictor |
function used to estimate the node prediction, such as Mode, mean, or median. Alternatively, pass an array of numbers to replace the yHat column of the tree |
verbose |
level of verbosity |
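The MDI score and the zeroLeaf option can be illustrated on a single split. The following is a minimal sketch in base R, not the package's implementation: the helper names gini and gini_decrease are hypothetical, and treating zeroLeaf as "a split that leaves one observation in a child contributes zero gain" is an assumption based on the argument description above.

```r
# Hypothetical helpers for illustration only; not part of rfVarImpOOB.

# Gini impurity of a vector of class labels: 1 - sum(p_k^2)
gini <- function(y) {
  if (length(y) == 0) return(0)
  p <- table(y) / length(y)
  1 - sum(p^2)
}

# Weighted impurity decrease when `y` is split by the logical vector `left`.
# Assumption: with zeroLeaf = TRUE, a split that leaves a single
# observation in either child contributes zero gain.
gini_decrease <- function(y, left, zeroLeaf = TRUE) {
  n  <- length(y)
  nL <- sum(left)
  nR <- n - nL
  if (zeroLeaf && (nL == 1 || nR == 1)) return(0)
  gini(y) - (nL / n) * gini(y[left]) - (nR / n) * gini(y[!left])
}

y    <- c(0, 0, 0, 1, 1, 1)
left <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
gini_decrease(y, left)   # perfect split: parent impurity 0.5, both children pure
```

Summing such decreases per variable within a tree and then combining the per-tree sums is the general idea behind an impurity-based importance; how the package separates in-bag from OOB rows is not shown here.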
Value
matrix of variable importance scores and their standard deviations
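As a rough illustration of how per-tree scores might be aggregated into a matrix of scores and standard deviations, the sketch below uses made-up per-tree values. The array layout, the column names score and sd, and the helper aggregate_scores are all assumptions; they do not describe the package's actual internals or output format.

```r
# Hypothetical per-tree importance scores: rows = trees, columns = variables.
# The numbers are made up purely for illustration.
per_tree <- rbind(
  c(Sex = 0.30, Pclass = 0.10, PassengerId = 0.02),
  c(Sex = 0.26, Pclass = 0.14, PassengerId = 0.00),
  c(Sex = 0.34, Pclass = 0.06, PassengerId = 0.04)
)

# Aggregate across trees with the chosen method; "none" returns the raw array.
aggregate_scores <- function(scores, agg = c("mean", "median", "none")) {
  agg <- match.arg(agg)
  if (agg == "none") return(scores)
  center <- apply(scores, 2, if (agg == "mean") mean else median)
  cbind(score = center, sd = apply(scores, 2, sd))
}

aggregate_scores(per_tree, "mean")
```

With agg = "none" the raw per-tree array comes back unchanged, which mirrors the debugging use described for the agg argument.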
Author(s)
Markus Loecher <Markus.Loecher@gmail.com>
Examples
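# load the Titanic training data shipped with the package
data("titanic_train", package = "rfVarImpOOB", envir = environment())
set.seed(123)
# work on a random subset of 300 rows
ranRows = sample(nrow(titanic_train), 300)
data = titanic_train[ranRows,]
# keep.inbag=TRUE is needed because GiniImportanceForest assumes in-bag information is stored
RF = randomForest::randomForest(formula = Survived ~ Sex + Pclass + PassengerId,
                                data=data,
                                ntree=5, importance=TRUE,
                                mtry=3, keep.inbag=TRUE,
                                nodesize = 20)
# as.numeric() on a factor returns its integer codes; subtracting 1 makes them start at 0
data$Survived = as.numeric(data$Survived)-1
VI_Titanic = GiniImportanceForest(RF, data, ylabel="Survived")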
[Package rfVarImpOOB version 1.0.3 Index]