mdt.bin {monobin} | R Documentation |
Monotonic binning driven by decision tree
Description
mdt.bin
implements monotonic binning driven by a decision tree. As the splitting metric, the algorithm
uses the sum of squared errors for a continuous target and the Gini index for a binary target.
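The two splitting metrics can be illustrated in base R. The sketch below is not the package's internal code: the helper names sse, gini and split.score are hypothetical, and weighting the child nodes by size for the Gini index is an assumption borrowed from common decision tree practice.

```r
# Hypothetical helpers, not part of monobin.

# Sum of squared errors of a node (continuous target).
sse <- function(y) sum((y - mean(y))^2)

# Gini impurity of a node (binary 0/1 target).
gini <- function(y) {
  p <- mean(y)
  2 * p * (1 - p)
}

# Score of splitting x at cut.point: lower is better.
split.score <- function(x, y, cut.point, binary) {
  left  <- y[x <= cut.point]
  right <- y[x >  cut.point]
  if (binary) {
    # size-weighted Gini impurity of the two child nodes (assumed weighting)
    (length(left) * gini(left) + length(right) * gini(right)) / length(y)
  } else {
    # total SSE of the two child nodes
    sse(left) + sse(right)
  }
}

x      <- c(1, 2, 3, 4, 5, 6)
y.bin  <- c(0, 0, 0, 1, 1, 1)
y.cont <- c(1.0, 1.2, 0.9, 3.1, 2.9, 3.0)

split.score(x, y.bin,  cut.point = 3, binary = TRUE)   # pure child nodes give 0
split.score(x, y.cont, cut.point = 3, binary = FALSE)
split.score(x, y.cont, cut.point = 2, binary = FALSE)  # worse split, larger SSE
```

A tree-driven binning would evaluate such a score over the candidate cut points and keep the best one, subject to the constraints described under Arguments.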
Usage
mdt.bin(
  x,
  y,
  g = 50,
  sc = c(NA, NaN, Inf, -Inf),
  sc.method = "together",
  y.type = NA,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  force.trend = NA
)
Arguments
x |
Numeric vector to be binned. |
y |
Numeric target vector (binary or continuous). |
g |
Number of splitting groups for each node. Default is 50. |
sc |
Numeric vector with special case elements. Default values are c(NA, NaN, Inf, -Inf). |
sc.method |
Define how special cases will be treated, all together or in separate bins.
Possible values are |
y.type |
Type of |
min.pct.obs |
Minimum percentage of observations per bin. Default is 0.05 or minimum 30 observations. |
min.avg.rate |
Minimum |
force.trend |
If the expected trend should be forced. Possible values: |
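To make the role of sc concrete, the base R sketch below separates special cases from complete cases. It is an illustration only and does not show how mdt.bin handles them internally; the variable names x and is.special are hypothetical.

```r
# Special case values, matching the default of the sc argument.
sc <- c(NA, NaN, Inf, -Inf)

# Hypothetical input vector with four special values.
x <- c(10, NA, 25, Inf, 40, NaN, 55, -Inf, 70)

# %in% matches NA to NA and NaN to NaN, so all four special values are caught.
is.special <- x %in% sc

x[is.special]    # special cases
x[!is.special]   # complete cases
```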
Value
The command mdt.bin
returns a list of two objects. The first object, the data frame summary.tbl,
presents a summary table of the final binning, while the second, x.trans,
is a vector of discretized values.
If x
or y
has a single unique value among the complete cases (cases other than special cases),
a data frame with info is returned instead.
Examples
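suppressMessages(library(monobin))
# load the gcd data set
data(gcd)
# bin amount against the target qual
amt.bin <- mdt.bin(x = gcd$amount, y = gcd$qual)
# summary table of the final binning
amt.bin[[1]]
# frequency of the discretized values
table(amt.bin[[2]])
# force decreasing trend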
mdt.bin(x = gcd$amount, y = gcd$qual, force.trend = "d")[[1]]
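
What a decreasing trend means for a binned variable can be checked with a few lines of base R. The sketch below uses made-up data and hand-picked bin edges rather than the mdt.bin algorithm; all names in it are hypothetical.

```r
# Hypothetical data: the target rate falls as x grows.
x <- 1:12
y <- c(1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0)

# Three bins with hand-picked edges (not produced by mdt.bin).
bins <- cut(x, breaks = c(0, 4, 8, 12))

# Average target value per bin, in bin order.
y.avg <- as.numeric(tapply(y, bins, mean))
y.avg

# TRUE when the averages never increase from one bin to the next.
is.decreasing <- all(diff(y.avg) <= 0)
is.decreasing
```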