decision.tree {PDtoolkit}    R Documentation

Custom decision tree algorithm

Description

decision.tree runs a customized decision tree algorithm. The customization covers the minimum percentage of observations and of defaults in each node, the maximum tree depth, a monotonicity condition at each splitting node, and the statistical test (test of two proportions) used for node splitting.
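
As a rough illustration of the splitting test mentioned above, the sketch below applies base R's prop.test to one hypothetical candidate split. The counts are synthetic, the variable names are made up for this illustration, and this is not the package's internal code, only an assumption about how a test of two proportions can be used to accept or reject a split.

```r
# Illustrative only: a two-proportion test on one hypothetical candidate split.
# All counts and names below are assumptions, not taken from PDtoolkit.
n.left  <- 400   # observations in the left child node
d.left  <- 12    # defaults in the left child node
n.right <- 600   # observations in the right child node
d.right <- 54    # defaults in the right child node

# Default rates in the two child nodes
rate.left  <- d.left / n.left
rate.right <- d.right / n.right

# Base R test of equality of two proportions
test <- prop.test(x = c(d.left, d.right), n = c(n.left, n.right))

# alpha plays the role of the p.value argument in this sketch
alpha <- 0.05
split.accepted <- test$p.value < alpha
split.accepted
```

In this sketch a split is kept only when the default rates of the two child nodes differ significantly at the chosen level.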

Usage

decision.tree(
  db,
  rf,
  target,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.value = 0.5,
  max.depth = NA,
  monotonicity
)

Arguments

db

Data frame of risk factors and target variable supplied for interaction extraction.

rf

Character vector of risk factor names on which decision tree is run.

target

Name of target variable (default indicator 0/1) within db argument.

min.pct.obs

Minimum percentage of observations in each leaf. Default is 0.05 or 30 observations.

min.avg.rate

Minimum percentage of defaults in each leaf. Default is 0.01 or 1 default case.

p.value

Significance level of the test of two proportions used as the splitting criterion. Default is 0.5, as given in Usage.

max.depth

Maximum tree depth. Default is NA.

monotonicity

Logical indicator. If TRUE, the observed trend between the risk factor and the target will be preserved in each splitting node.
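
The node-level constraints described by min.pct.obs, min.avg.rate and monotonicity can be pictured with the minimal sketch below. The helper split.admissible and its trend argument are hypothetical and not part of PDtoolkit; the sketch checks only the fractional thresholds (the absolute floors of 30 observations and 1 default case are not modelled) and assumes that leaves are ordered by increasing risk factor value.

```r
# Hypothetical helper for illustration; this is not a PDtoolkit function.
# n.leaf:  observations per leaf, ordered by increasing risk factor value
# d.leaf:  defaults per leaf, in the same order
# n.total: observations in the whole sample
# trend:   assumed observed trend, +1 (increasing) or -1 (decreasing)
split.admissible <- function(n.leaf, d.leaf, n.total, trend,
                             min.pct.obs = 0.05, min.avg.rate = 0.01,
                             monotonicity = TRUE) {
  rate <- d.leaf / n.leaf
  # every leaf must hold at least min.pct.obs of all observations
  size.ok <- all(n.leaf / n.total >= min.pct.obs)
  # every leaf must have a default rate of at least min.avg.rate
  rate.ok <- all(rate >= min.avg.rate)
  # default rates must move in the direction of the observed trend
  mono.ok <- !monotonicity || all(sign(diff(rate)) == trend)
  size.ok && rate.ok && mono.ok
}

# Rates 0.03 and 0.09 follow an increasing trend
ok.split <- split.admissible(c(400, 600), c(12, 54), n.total = 1000, trend = 1)
# Same split, but the assumed observed trend is decreasing
wrong.trend <- split.admissible(c(400, 600), c(12, 54), n.total = 1000, trend = -1)
# The second leaf holds only 4% of the observations
small.leaf <- split.admissible(c(960, 40), c(50, 16), n.total = 1000, trend = 1)
```

Under these assumptions only the first candidate split passes all three checks.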

Value

The command decision.tree returns an object of class cdt. For details on the output elements, see the Examples.

See Also

predict.cdt

Examples

suppressMessages(library(PDtoolkit))
data(loans)
#modify risk factors in order to show how the function works with missing values
loans$"Account Balance"[1:10] <- NA
loans$"Duration of Credit (month)"[c(13, 15)] <- NA
tree.res <- decision.tree(db = loans,
	rf = c("Account Balance", "Duration of Credit (month)"), 
	target = "Creditability",
	min.pct.obs = 0.05,
	min.avg.rate = 0.01,
	p.value = 0.05,
	max.depth = NA,
	monotonicity = TRUE)
str(tree.res)

[Package PDtoolkit version 1.2.0 Index]