R: Custom decision tree algorithm

decision.tree {PDtoolkit}

R Documentation

Custom decision tree algorithm

Description

decision.tree runs a customized decision tree algorithm. The customization covers the minimum percentage of observations and of defaults in each node, the maximum tree depth, a monotonicity condition at each splitting node, and the statistical test (test of two proportions) used for node splitting.
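As a rough illustration of the splitting criterion, the sketch below applies a test of two proportions to one candidate split. It uses `prop.test` from base R as a stand-in; whether decision.tree uses this exact variant of the test internally is an assumption, and the counts are made up for illustration.

```r
# Hypothetical candidate split: observation and default counts in the two
# child nodes (illustrative numbers, not taken from the loans data).
n_left  <- 400; d_left  <- 40    # 10% default rate
n_right <- 600; d_right <- 120   # 20% default rate

# Two-sided test of equal proportions from base R (stats::prop.test).
# Assumption: this stands in for the test applied inside decision.tree.
test <- prop.test(x = c(d_left, d_right), n = c(n_left, n_right))

# Keep the split only if the default rates differ significantly
# at the chosen level (the role played by the p.value argument).
p.value <- 0.05
accept_split <- test$p.value < p.value
accept_split
```

A lower significance level makes splits harder to accept and therefore tends to give a smaller tree.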

Usage

decision.tree(
  db,
  rf,
  target,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.value = 0.05,
  max.depth = NA,
  monotonicity
)

Arguments

`db`	Data frame of risk factors and target variable supplied for interaction extraction.
`rf`	Character vector of risk factor names on which decision tree is run.
`target`	Name of target variable (default indicator 0/1) within db argument.
`min.pct.obs`	Minimum percentage of observations in each leaf. Default is 0.05 or 30 observations.
`min.avg.rate`	Minimum percentage of defaults in each leaf. Default is 0.01 or 1 default case.
`p.value`	Significance level of the test of two proportions used as the splitting criterion. Default is 0.05.
`max.depth`	Maximum tree depth.
`monotonicity`	Logical indicator. If `TRUE`, observed trend between risk factor and target will be preserved in splitting node.
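The leaf-size arguments can be read as the following check. This is a minimal sketch: `leaf_ok` is a hypothetical helper and not part of PDtoolkit, and it assumes that each percentage is combined with its absolute floor (30 observations, 1 default) by taking the larger of the two, and that `min.avg.rate` is applied to the number of observations in the leaf.

```r
# Hypothetical helper, not part of PDtoolkit: checks whether a candidate leaf
# satisfies the minimum-size and minimum-defaults constraints.
leaf_ok <- function(target_leaf, n_total,
                    min.pct.obs = 0.05, min.avg.rate = 0.01) {
  n_leaf <- length(target_leaf)   # observations in the leaf
  n_def  <- sum(target_leaf)      # defaults (target == 1) in the leaf
  # Assumption: percentage threshold and absolute floor combine as a maximum.
  min_obs <- max(ceiling(min.pct.obs * n_total), 30)
  min_def <- max(ceiling(min.avg.rate * n_leaf), 1)
  n_leaf >= min_obs && n_def >= min_def
}

# 60 observations with 6 defaults, out of 1000 observations in total
big_leaf   <- leaf_ok(rep(c(0, 1), times = c(54, 6)), n_total = 1000)
# only 20 observations, below both the 5% share and the floor of 30
small_leaf <- leaf_ok(rep(c(0, 1), times = c(19, 1)), n_total = 1000)
```

Under these assumptions the first leaf passes both constraints and the second fails on size alone.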

Value

The command decision.tree returns an object of class cdt. For details on output elements see the Examples.

Examples

suppressMessages(library(PDtoolkit))
data(loans)
# Modify risk factors to show how the function handles missing values
loans$"Account Balance"[1:10] <- NA
loans$"Duration of Credit (month)"[c(13, 15)] <- NA
tree.res <- decision.tree(db = loans,
	rf = c("Account Balance", "Duration of Credit (month)"), 
	target = "Creditability",
	min.pct.obs = 0.05,
	min.avg.rate = 0.01,
	p.value = 0.05,
	max.depth = NA,
	monotonicity = TRUE)
str(tree.res)