rf.interaction.transformer {PDtoolkit} | R Documentation |
Extract interactions from random forest
Description
rf.interaction.transformer
extracts interactions from a random forest.
It implements a customized random forest algorithm in which each decision tree is grown subject to several conditions: a minimum
percentage of observations and of defaults in each node, a maximum tree depth, and a monotonicity condition
at each splitting node. The Gini index is used as the metric for node splitting.
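To make the splitting rule concrete, the following is a minimal sketch of a Gini-based split search on a single numeric risk factor that honours the minimum share of observations and the minimum default rate in each child node. It is an illustration under stated assumptions, not the package's internal code: the helpers gini and best_split are hypothetical, the data contain no missing values, and the monotonicity condition is left out.

```r
# Gini impurity of a 0/1 target vector
gini <- function(y) {
  p <- mean(y)
  2 * p * (1 - p)
}

# Hypothetical helper (not part of PDtoolkit): best threshold on one
# numeric risk factor x for a 0/1 target y, assuming no missing values.
best_split <- function(x, y, min.pct.obs = 0.05, min.avg.rate = 0.01) {
  n <- length(y)
  cand <- sort(unique(x))
  cand <- cand[-length(cand)]  # splitting at the maximum leaves the right node empty
  best <- list(threshold = NA, gini = gini(y))  # start from the unsplit node
  for (thr in cand) {
    left <- y[x <= thr]
    right <- y[x > thr]
    # each child must hold at least min.pct.obs of the observations
    if (min(length(left), length(right)) / n < min.pct.obs) next
    # each child must have a default rate of at least min.avg.rate
    if (min(mean(left), mean(right)) < min.avg.rate) next
    # weighted Gini impurity of the two children
    w <- (length(left) * gini(left) + length(right) * gini(right)) / n
    if (w < best$gini) best <- list(threshold = thr, gini = w)
  }
  best
}

x <- 1:10
y <- c(0, 1, 0, 0, 0, 1, 1, 0, 1, 1)
best_split(x, y, min.pct.obs = 0.2, min.avg.rate = 0.01)
```

A candidate threshold is skipped as soon as either child violates a constraint, so tightening min.pct.obs or min.avg.rate can change which split is selected, or leave the node unsplit (threshold NA).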
Usage
rf.interaction.transformer(
db,
rf,
target,
num.rf = NA,
num.tree,
min.pct.obs,
min.avg.rate,
max.depth,
monotonicity,
create.interaction.rf,
seed = 991
)
Arguments
db |
Data frame of risk factors and target variable supplied for interaction extraction. |
rf |
Character vector of the names of the risk factors on which the decision trees are run. |
target |
Name of the target variable (default indicator 0/1) within db. |
num.rf |
Number of risk factors randomly selected for each decision tree. If default value ( |
num.tree |
Number of decision trees in the random forest. |
min.pct.obs |
Minimum percentage of observations in each leaf. |
min.avg.rate |
Minimum percentage of defaults in each leaf. |
max.depth |
Maximum tree depth (maximum number of splits). |
monotonicity |
Logical indicator. If |
create.interaction.rf |
Logical indicator. If |
seed |
Random seed to ensure result reproducibility. |
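The roles of num.rf, num.tree and seed described above can be illustrated with a short sketch of per-tree random selection of risk factors. This is only an assumption about how such a draw might look, not the package's implementation; draw_risk_factors is a hypothetical helper, and num.rf must be given explicitly here because the sketch does not reproduce the package's handling of the default value.

```r
# Hypothetical helper (not part of PDtoolkit): for each tree, draw
# num.rf risk factor names at random, reproducibly for a given seed.
draw_risk_factors <- function(rf, num.rf, num.tree, seed = 991) {
  stopifnot(num.rf <= length(rf))
  set.seed(seed)
  lapply(seq_len(num.tree), function(i) sample(rf, size = num.rf))
}

# Illustrative risk factor names only
rf.names <- c("age", "income", "duration", "purpose", "savings")
draw_risk_factors(rf = rf.names, num.rf = 2, num.tree = 3, seed = 579)
```

Calling the helper twice with the same seed returns the same subsets, which is the sense in which a seed makes the result reproducible.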
Value
rf.interaction.transformer
returns a list of two data frames. The first data frame (tree.info) provides
a summary of the trees. The second data frame (interaction) contains the new risk factor extracted from the random forest.
Examples
library(PDtoolkit)
data(loans)
# modify risk factors in order to show how the function works with missing values
loans$"Account Balance"[1:10] <- NA
loans$"Duration of Credit (month)"[c(13, 15)] <- NA
# run the random forest on all columns except the target
rf.it <- rf.interaction.transformer(db = loans,
                                    rf = names(loans)[!names(loans) %in% "Creditability"],
                                    target = "Creditability",
                                    num.rf = NA,
                                    num.tree = 3,
                                    min.pct.obs = 0.05,
                                    min.avg.rate = 0.01,
                                    max.depth = 2,
                                    monotonicity = TRUE,
                                    create.interaction.rf = TRUE,
                                    seed = 579)
# names of the two returned data frames
names(rf.it)
# summary of the trees
rf.it[["tree.info"]]
# last rows and frequency table of the extracted risk factor
tail(rf.it[["interaction"]])
table(rf.it[["interaction"]][, 1], useNA = "always")