decision.tree {PDtoolkit} | R Documentation |
Custom decision tree algorithm
Description
decision.tree
runs a customized decision tree algorithm. Customization covers the minimum
percentage of observations and defaults in each node, the maximum tree depth, the monotonicity condition
at each splitting node, and the statistical test (test of two proportions) used for node splitting.
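The splitting conditions described above can be illustrated with a short base R sketch. This is not PDtoolkit's internal code: the helper name check.split and its arguments (y, left, n.total, alpha) are hypothetical, and the test of two proportions is done here with stats::prop.test as an assumed stand-in for whatever the package uses internally.

```r
# Hypothetical helper, for illustration only (not part of PDtoolkit).
# Checks whether a candidate binary split meets the node-size, default-rate
# and two-proportion-test conditions described in this help page.
check.split <- function(y, left, n.total,
                        min.pct.obs = 0.05, min.avg.rate = 0.01,
                        alpha = 0.05) {
  # y: 0/1 target in the current node; left: logical, TRUE for the left child
  n <- c(sum(left), sum(!left))            # observations per child node
  d <- c(sum(y[left]), sum(y[!left]))      # defaults per child node
  size.ok <- all(n / n.total >= min.pct.obs)
  rate.ok <- all(d / n >= min.avg.rate)
  # two-sided test of two proportions from base R (assumed stand-in)
  test <- suppressWarnings(prop.test(x = d, n = n, correct = FALSE))
  list(size.ok = size.ok,
       rate.ok = rate.ok,
       p = test$p.value,
       split = size.ok && rate.ok && test$p.value < alpha)
}

# Simulated data: default rate differs clearly on either side of x = 0.5
set.seed(1)
x <- runif(1000)
y <- rbinom(1000, 1, ifelse(x < 0.5, 0.05, 0.30))
res <- check.split(y = y, left = x < 0.5, n.total = length(y))
str(res)
```

In this sketch a split is accepted only when both child nodes are large enough, both have a sufficient default rate, and the default rates differ significantly between them.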
Usage
decision.tree(
  db,
  rf,
  target,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.value = 0.5,
  max.depth = NA,
  monotonicity
)
Arguments
db |
Data frame of risk factors and target variable supplied for interaction extraction. |
rf |
Character vector of risk factor names on which decision tree is run. |
target |
Name of target variable (default indicator 0/1) within db argument. |
min.pct.obs |
Minimum percentage of observations in each leaf. Default is 0.05 or 30 observations. |
min.avg.rate |
Minimum percentage of defaults in each leaf. Default is 0.01 or 1 default case. |
p.value |
Significance level of the test of two proportions used as the splitting criterion. Default is 0.5. |
max.depth |
Maximum tree depth. |
monotonicity |
Logical indicator. If |
Value
The command decision.tree
returns an object of class cdt. For details on the output elements, see the Examples.
See Also
Examples
suppressMessages(library(PDtoolkit))
data(loans)
# modify risk factors to show how the function handles missing values
loans$"Account Balance"[1:10] <- NA
loans$"Duration of Credit (month)"[c(13, 15)] <- NA
tree.res <- decision.tree(db = loans,
                          rf = c("Account Balance", "Duration of Credit (month)"),
                          target = "Creditability",
                          min.pct.obs = 0.05,
                          min.avg.rate = 0.01,
                          p.value = 0.05,
                          max.depth = NA,
                          monotonicity = TRUE)
str(tree.res)
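Before a call like the one above, it can help to confirm that the inputs match the argument descriptions: the risk factors and target exist in db, and the target is a 0/1 indicator. The sketch below is only an illustration in base R; check.inputs and the toy data frame are hypothetical and not part of PDtoolkit.

```r
# Hypothetical helper, for illustration only (not part of PDtoolkit).
check.inputs <- function(db, rf, target) {
  # db must be a data frame that contains all risk factors and the target
  stopifnot(is.data.frame(db),
            all(c(rf, target) %in% names(db)))
  y <- db[[target]]
  list(target.binary = all(y[!is.na(y)] %in% c(0, 1)),           # 0/1 indicator?
       n.missing = sapply(db[rf], function(x) sum(is.na(x))))    # NAs per risk factor
}

# Toy data standing in for a real portfolio (made-up values)
toy <- data.frame("Account Balance" = c(NA, 1, 2, 3, 4),
                  "Duration" = c(12, NA, 24, 36, 48),
                  "Creditability" = c(1, 0, 1, 1, 0),
                  check.names = FALSE)
chk <- check.inputs(toy,
                    rf = c("Account Balance", "Duration"),
                    target = "Creditability")
str(chk)
```

The count of missing values per risk factor is reported rather than treated as an error, since the example above runs the function on risk factors that contain missing values.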