rfPermute {rfPermute} | R Documentation |
Estimate Permutation p-values for Random Forest Importance Metrics
Description
Estimate the significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of importance metrics for each predictor variable and a p-value for each observed value.
Usage
rfPermute(x, ...)
## Default S3 method:
rfPermute(x, y = NULL, ..., num.rep = 100, num.cores = 1)
## S3 method for class 'formula'
rfPermute(
formula,
data = NULL,
...,
subset,
na.action = na.fail,
num.rep = 100,
num.cores = 1
)
as.randomForest(x)
## S3 method for class 'rfPermute'
print(x, ...)
## S3 method for class 'rfPermute'
predict(object, ...)
Arguments
x, y, formula, data, subset, na.action, ... |
See |
num.rep |
Number of permutation replicates to run to construct the null distribution and calculate p-values (default = 100). |
num.cores |
Number of CPUs over which to distribute the permutation replicates.
Defaults to |
object |
an |
Details
All other parameters are as defined in randomForest.formula.
A Random Forest model is first created as normal to calculate the observed values of variable importance. The response variable is then permuted num.rep times, with a new Random Forest model built for each permutation step.
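The procedure described above can be sketched in base R. This is an illustration only, not the package's implementation: permute_importance and importance_fn are hypothetical names, the toy metric (absolute correlation) stands in for Random Forest importance so the sketch runs without extra packages, and the p-value formula is one common convention that is assumed here rather than taken from this page.

```r
# Sketch of a response-permutation test for variable importance.
# `importance_fn` is a hypothetical stand-in: any function(x, y) that
# returns one named importance value per predictor.
permute_importance <- function(x, y, importance_fn, num.rep = 100) {
  # Observed importance from the unpermuted response
  observed <- importance_fn(x, y)

  # Null distribution: recompute importance on a shuffled response
  # num.rep times
  null <- vapply(
    seq_len(num.rep),
    function(i) importance_fn(x, sample(y)),
    numeric(length(observed))
  )
  # Force a predictors-by-replicates matrix, even with one predictor
  null <- matrix(null, nrow = length(observed),
                 dimnames = list(names(observed), NULL))

  # p-value: share of null values at least as large as the observed one,
  # counting the observed value itself (an assumed convention)
  p.value <- (rowSums(null >= observed) + 1) / (num.rep + 1)

  list(observed = observed, null = null, p.value = p.value)
}

# Toy importance metric (NOT Random Forest importance): absolute correlation
abs_cor <- function(x, y) abs(cor(x, y))[, 1]

set.seed(1)
aq <- na.omit(airquality)
res <- permute_importance(aq[, -1], aq$Ozone, abs_cor, num.rep = 50)
res$p.value
```

Under this convention the smallest attainable p-value is 1 / (num.rep + 1), so the default num.rep = 100 cannot yield a p-value below 1/101.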
Value
An rfPermute object.
Author(s)
Eric Archer eric.archer@noaa.gov
Examples
# A regression model predicting ozone levels
data(airquality)
ozone.rp <- rfPermute(Ozone ~ ., data = airquality, na.action = na.omit, ntree = 100, num.rep = 50)
ozone.rp
# Plot the scaled importance distributions
# Significant (p <= 0.05) predictors are in red
plotImportance(ozone.rp, scale = TRUE)
# Plot the importance null distributions and observed values for two of the predictors
plotNull(ozone.rp, preds = c("Solar.R", "Month"))
# A classification model classifying cars to manual or automatic transmission
data(mtcars)
am.rp <- rfPermute(factor(am) ~ ., mtcars, ntree = 100, num.rep = 50)
summary(am.rp)
plotImportance(am.rp, scale = TRUE, sig.only = TRUE)