pct.bin {monobin} | R Documentation |
Monotonic binning based on percentiles
Description
pct.bin
implements percentile-based monotonic binning through iterative discretization.
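The idea of percentile-based iterative discretization can be illustrated with a minimal base-R sketch. This is not the monobin implementation: pct_bin_sketch is a hypothetical helper, and the stopping rule used here (reduce the number of percentile groups until the bin averages of y are strictly monotonic) is an assumption made for illustration only.

```r
# Illustrative sketch only: NOT the monobin implementation.
# pct_bin_sketch() is a hypothetical helper name.
pct_bin_sketch <- function(x, y, g = 15) {
  for (k in seq(g, 1)) {
    # Candidate cut points: k equal-frequency (percentile) groups.
    brks <- unique(quantile(x, probs = seq(0, 1, length.out = k + 1),
                            names = FALSE))
    if (length(brks) < 2) next  # x has a single unique value
    bin <- cut(x, breaks = brks, include.lowest = TRUE)
    # Average target per bin, in bin order; drop empty bins.
    y_avg <- tapply(y, bin, mean)
    y_avg <- y_avg[!is.na(y_avg)]
    # Accept the first grouping whose bin averages are strictly monotonic
    # (a single bin is trivially accepted).
    d <- diff(as.numeric(y_avg))
    if (all(d > 0) || all(d < 0)) {
      return(list(bin = bin, y_avg = y_avg, groups = length(y_avg)))
    }
  }
  NULL
}

# Toy data (assumed, not from the gcd data set).
set.seed(1)
x_toy <- runif(500)
y_toy <- rbinom(500, 1, plogis(-2 + 4 * x_toy))
res <- pct_bin_sketch(x_toy, y_toy, g = 15)
res$y_avg
```

Starting from a larger g gives the search more candidate groupings to try, at the cost of more iterations before a monotonic one is found.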
Usage
pct.bin(
x,
y,
sc = c(NA, NaN, Inf, -Inf),
sc.method = "together",
g = 15,
y.type = NA,
woe.trend = TRUE,
force.trend = NA
)
Arguments
x |
Numeric vector to be binned. |
y |
Numeric target vector (binary or continuous). |
sc |
Numeric vector with special case elements. Default values are c(NA, NaN, Inf, -Inf). |
sc.method |
Defines how special cases are treated: all together or in separate bins.
Possible values are "together" (default) and "separately". |
g |
Number of starting groups. Default is 15. |
y.type |
Type of y. Default is NA. |
woe.trend |
Applied only for a continuous target (y). Default is TRUE. |
force.trend |
Whether the expected trend should be forced. Possible values include "d", as used in the Examples section. Default is NA. |
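The roles of sc and sc.method can be sketched in base R as follows. This is a hedged illustration, not monobin code: split_special and the "SC" labels are hypothetical names chosen for the example. It relies on the fact that %in% matches NA to NA and NaN to NaN, so missing values can be listed as special cases alongside ordinary numbers.

```r
# Sketch under assumptions: split_special() and the "SC" labels are
# hypothetical and only illustrate the idea behind sc and sc.method.
split_special <- function(x, sc = c(NA, NaN, Inf, -Inf),
                          sc.method = c("together", "separately")) {
  sc.method <- match.arg(sc.method)
  # %in% treats NA as matching NA and NaN as matching NaN.
  is_sc <- x %in% sc
  label <- rep(NA_character_, length(x))
  if (sc.method == "together") {
    label[is_sc] <- "SC"                      # one shared bin
  } else {
    label[is_sc] <- paste0("SC::", x[is_sc])  # one bin per special value
  }
  list(complete = x[!is_sc], is_sc = is_sc, sc_label = label)
}

x_sc <- c(25, NA, 40, Inf, 9999999999, 31)
sc_values <- c(NA, NaN, Inf, -Inf, 9999999999)
together <- split_special(x_sc, sc = sc_values)
separate <- split_special(x_sc, sc = sc_values, sc.method = "separately")
table(together$sc_label)
table(separate$sc_label)
```

Only the complete cases (together$complete) would then go through the binning itself; the special cases keep their own bin or bins.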
Value
pct.bin returns a list of two objects. The first, the data frame summary.tbl, presents a summary table of the final binning; the second, x.trans, is a vector of discretized values.
If x or y has a single unique value among the complete cases (cases that are not special cases), it returns a data frame with info.
Examples
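suppressMessages(library(monobin))
#dplyr supplies %>%, group_by() and summarise() used in the last example
suppressMessages(library(dplyr))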
data(gcd)
#binary target
mat.bin <- pct.bin(x = gcd$maturity, y = gcd$qual)
mat.bin[[1]]
table(mat.bin[[2]])
#binary target, separate groups for special cases
set.seed(123)
gcd$age.d <- gcd$age
gcd$age.d[sample(1:nrow(gcd), 10)] <- NA
gcd$age.d[sample(1:nrow(gcd), 3)] <- 9999999999
age.d.bin <- pct.bin(x = gcd$age.d,
                     y = gcd$qual,
                     sc = c(NA, NaN, Inf, -Inf, 9999999999),
                     sc.method = "separately",
                     force.trend = "d")
age.d.bin[[1]]
gcd$age.d.bin <- age.d.bin[[2]]
gcd %>% group_by(age.d.bin) %>% summarise(n = n(), y.avg = mean(qual))