orsf_vint {aorsf} | R Documentation |
Variable Interactions
Description
Use the variable interaction score described in Greenwell et al (2018).
As this method can be computationally demanding, using n_thread=0
can substantially reduce time needed to compute scores.
Usage
orsf_vint(
object,
predictors = NULL,
n_thread = NULL,
verbose_progress = NULL,
sep = ".."
)
Arguments
object |
(ObliqueForest) a trained oblique random forest object (see orsf) |
predictors |
(character) a vector of length 2 or more with names
of predictors used by |
n_thread |
(integer) number of threads to use while growing trees, computing predictions, and computing importance. Default is 0, which allows a suitable number of threads to be used based on availability. |
verbose_progress |
(logical) if |
sep |
(character) how to separate the names of two predictors.
The default value of |
Details
The number of possible interactions grows exponentially based on the
number of predictors. Some caution is warranted when using large predictor
sets and it is recommended that you supply a specific vector of predictor
names to assess rather than a global search. A good strategy is to use
n_tree = 5
to search all predictors, then pick the top 10 interactions,
get the unique predictors from them, and re-run on just those predictors
with more trees.
Value
a data.table with variable interaction scores and partial dependence values.
References
Greenwell, M B, Boehmke, C B, McCarthy, J A (2018). "A simple and effective model-based variable importance measure." arXiv preprint arXiv:1805.04755.
Examples
set.seed(329)
data <- data.frame(
x1 = rnorm(500),
x2 = rnorm(500),
x3 = rnorm(500)
)
data$y = with(data, expr = x1 + x2 + x3 + 1/2*x1 * x2 + x2 * x3 + rnorm(500))
forest <- orsf(data, y ~ ., n_tree = 5)
orsf_vint(forest)