ush.bin {PDtoolkit}    R Documentation

U-shape binning algorithm

Description

ush.bin performs U-shape binning. All algorithms from the monobin package are available. Because of the nature of these binning algorithms, the function may fail to find a U-shape for some selected knots. Users are therefore encouraged to inspect the results in detail and to try different binning algorithms.

Usage

ush.bin(
  x,
  y,
  knot,
  method,
  sc = c(NA, Inf, -Inf, NaN),
  sc.method = "together",
  g = 20,
  min.pct.obs = 0.05,
  min.avg.rate = 0.01,
  p.val = 0.05,
  woe.trend = TRUE,
  woe.gap = 0.1
)

Arguments

x

Numeric vector to be binned.

y

Numeric target vector (binary).

knot

Numeric value of the selected knot, usually taken from the results of the ush.test function.

method

Binning method. All methods from the monobin package are available: "cum.bin", "iso.bin", "ndr.bin", "pct.bin", "sts.bin", "woe.bin", "mdt.bin".

sc

Numeric vector of special case elements. Default values are c(NA, Inf, -Inf, NaN). It is recommended to always keep the default values and to add new ones if needed. If any of these values exist in x but are not defined in the sc vector, the function may report an error.
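The role of sc can be illustrated with a minimal base R sketch. It only shows how values listed in sc could be separated from the rest of x before binning; it is not the package's internal code, and the data and the names is.sc, x.special, x.complete and uncovered are made up for illustration.

```r
# Illustrative data only; not taken from the package.
x  <- c(120, NA, 450, Inf, 80, NaN, -Inf, 300)
sc <- c(NA, Inf, -Inf, NaN)

# %in% matches NA with NA and NaN with NaN, so no separate is.na() call is needed.
is.sc <- x %in% sc

x.special  <- x[is.sc]   # values set aside as special cases
x.complete <- x[!is.sc]  # values left for the binning algorithm

# Non-finite values not covered by sc would remain in x.complete;
# an empty result here means sc covers all of them.
uncovered <- x.complete[!is.finite(x.complete)]
```

If a value such as NaN were dropped from sc, it would show up in uncovered, which is the situation the recommendation above is meant to avoid.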

sc.method

Defines how special cases are treated: all together or separately. Possible values are "together" and "separately".

g

Number of starting groups. Only needed for the "cum.bin", "pct.bin" and "mdt.bin" methods. Default is 20.

min.pct.obs

Minimum percentage of observations per bin. Default is 0.05 or 30 observations.
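One possible reading of "0.05 or 30 observations" is that the larger of the two limits applies. The sketch below works under that assumption only; min.bin.size and its floor.obs argument are hypothetical names and not part of PDtoolkit or monobin.

```r
# Hypothetical helper, not part of PDtoolkit or monobin.
# Assumption: the minimum bin size is the larger of the percentage-based
# limit and a fixed floor of 30 observations.
min.bin.size <- function(n, min.pct.obs = 0.05, floor.obs = 30) {
  max(ceiling(n * min.pct.obs), floor.obs)
}

size.large <- min.bin.size(1000)  # percentage-based limit dominates
size.small <- min.bin.size(400)   # fixed floor dominates
```

Under this assumption, a sample of 1000 observations would require at least 50 observations per bin, while a sample of 400 would fall back to the floor of 30.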

min.avg.rate

Minimum y average rate. Default is 0.01.

p.val

Threshold for p-value. Only needed for "sts.bin" and "ndr.bin" methods. Default is 0.05.

woe.trend

Logical. Only needed for the "pct.bin" method. Default is TRUE.

woe.gap

Minimum WoE gap between bins. Only needed for the "woe.bin" method. Default is 0.1.

Value

The command ush.bin returns a list of two objects. The first object, the data frame summary.tbl, presents a summary table of the final binning. The second object, x.trans, is a vector of discretized values.

Examples

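library(PDtoolkit)
data(gcd, package = "monobin")  # example data set shipped with monobin
res <- ush.bin(x = gcd$amount, y = gcd$qual, knot = 2992.579, method = "ndr.bin")
res[[1]]  # summary table of the final binning
plot(res[[1]]$dr, type = "l")  # plot the dr column across bins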
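Because the function may not find a U-shape for every knot, a quick numeric check can complement the plot. The helper below is a minimal sketch, not part of PDtoolkit: is.ushape is a hypothetical name, and the rate vectors are made up for illustration rather than produced by ush.bin.

```r
# Hypothetical helper, not part of PDtoolkit: TRUE when the values fall and
# then rise (U-shape) or rise and then fall (inverted U-shape), with exactly
# one change of direction.
is.ushape <- function(rate) {
  s <- sign(diff(rate))
  s <- s[s != 0]              # ignore flat steps between neighbouring bins
  turns <- sum(diff(s) != 0)  # number of direction changes
  length(s) >= 2 && turns == 1
}

# Illustrative rates only.
u.like   <- is.ushape(c(0.40, 0.25, 0.18, 0.22, 0.35))
monotone <- is.ushape(c(0.40, 0.30, 0.25, 0.20, 0.10))
```

In practice the input would be the dr column of the summary table, with any special-case rows (if present) removed first, since those rows are not part of the ordered bins.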

[Package PDtoolkit version 1.2.0 Index]