tree.interpreter {tree.interpreter}    R Documentation
Random Forest Prediction Decomposition and Feature Importance Measure
Description
An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate two feature importance measures, Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob), based on the work of Li et al. (2019) <arXiv:1906.10845>.
tidyRF
The function tidyRF can turn a randomForest or ranger object into a package-agnostic random forest object. All other functions in this package operate on such a tidyRF object.
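As a sketch of the randomForest route, the conversion might look like the following. It assumes the randomForest package is installed, that tidyRF takes the same (forest, features, response) arguments as in the ranger example in the Examples section, and that the forest is grown with keep.inbag = TRUE as it is there.

```r
library(randomForest)
library(tree.interpreter)

# Grow the forest with keep.inbag = TRUE, mirroring the ranger example
# (assumption: tidyRF needs the in-bag information for randomForest too)
rfobj <- randomForest(mpg ~ ., mtcars, keep.inbag = TRUE)

# Convert to a package-agnostic tidyRF object: features first, then response
tidy.RF <- tidyRF(rfobj, mtcars[, -1], mtcars[, 1])

# Inspect the top-level structure of the converted object
str(tidy.RF, max.level = 1)
```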
The featureContrib and trainsetBias families
The featureContrib and trainsetBias families can decompose the prediction of regression/classification trees/forests into bias and feature contribution components.
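The decomposition can be illustrated without the package. The base-R sketch below uses a hand-built regression tree on toy data; the data, the split points, and the helper decompose are all hypothetical and are not part of tree.interpreter. The bias is the mean response at the root, and each step down the decision path credits the change in node mean to the feature that was split on.

```r
# Toy training data (hypothetical, for illustration only)
train <- data.frame(
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(0, 1, 0, 1, 0, 1),
  y  = c(1, 4, 2, 10, 11, 12)
)

# Hand-built tree: the root splits on x1 <= 3, and its left child
# splits on x2 <= 0.5. The right child of the root is a leaf.
decompose <- function(x, train) {
  node <- train                 # training samples reaching the current node
  bias <- mean(node$y)          # root mean, i.e. the bias term
  contrib <- c(x1 = 0, x2 = 0)  # per-feature contributions

  # Root split on x1: credit the change in node mean to x1
  child <- if (x[["x1"]] <= 3) node[node$x1 <= 3, ] else node[node$x1 > 3, ]
  contrib["x1"] <- contrib["x1"] + mean(child$y) - mean(node$y)
  node <- child

  # Second split on x2, only in the left subtree
  if (x[["x1"]] <= 3) {
    child <- if (x[["x2"]] <= 0.5) node[node$x2 <= 0.5, ] else node[node$x2 > 0.5, ]
    contrib["x2"] <- contrib["x2"] + mean(child$y) - mean(node$y)
    node <- child
  }

  list(prediction = mean(node$y), bias = bias, contrib = contrib)
}

res <- decompose(c(x1 = 2, x2 = 1), train)

# prediction = bias + sum of feature contributions
stopifnot(isTRUE(all.equal(res$prediction, res$bias + sum(res$contrib))))
```

For a forest, the same bookkeeping would be averaged over trees; how the package aggregates is not shown here.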
The MDI and MDIoob families
The MDI family calculates the conventional MDI feature importance measure, which is known to suffer from feature selection bias. MDI-oob is a debiased variant of MDI computed on out-of-bag samples; Li et al. (2019) report state-of-the-art feature selection performance for it on both simulated and real data. It can be calculated with functions from the MDIoob family.
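The link between the prediction decomposition and MDI can be checked by hand on a single split. The base-R sketch below uses hypothetical toy data and none of the package's functions. It assumes variance as the node impurity and relies on the identity from Li et al. (2019) that a feature's MDI in a tree equals the average over training samples of (feature contribution x response). MDI-oob, as described there, evaluates the same average on the out-of-bag samples instead.

```r
# Toy data (hypothetical): one feature, one split at x <= 3
x <- 1:6
y <- c(1, 2, 3, 10, 11, 12)

# Population variance, used here as the node impurity
impurity <- function(v) mean((v - mean(v))^2)

left <- x <= 3

# MDI as the sample-weighted impurity decrease of the split
mdi_impurity <- impurity(y) -
  (mean(left) * impurity(y[left]) + mean(!left) * impurity(y[!left]))

# The same quantity from feature contributions:
# each sample's contribution is (child mean - root mean)
contrib <- ifelse(left, mean(y[left]), mean(y[!left])) - mean(y)
mdi_contrib <- mean(contrib * y)

# Both routes give the same value on the training samples
stopifnot(isTRUE(all.equal(mdi_impurity, mdi_contrib)))
```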
Examples
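library(ranger)
library(tree.interpreter)
# keep.inbag = TRUE stores in-bag counts, which identify each tree's out-of-bag samples
rfobj <- ranger(mpg ~ ., mtcars, keep.inbag = TRUE)
# Convert to a package-agnostic tidyRF object: features first, then response
tidy.RF <- tidyRF(rfobj, mtcars[, -1], mtcars[, 1])
# Debiased MDI-oob feature importance
MDIoob(tidy.RF, mtcars[, -1], mtcars[, 1])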