binning_rgr {dlookr}    R Documentation

Binning by recursive information gain ratio maximization

Description

binning_rgr() finds intervals for a numerical variable using recursive information gain ratio maximization.
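To make the criterion concrete, the following base R sketch scores candidate cut points by information gain ratio and picks the best one. It is an illustration under assumptions, not the dlookr implementation: the helpers entropy() and gain_ratio() and the toy vectors are hypothetical, and only a single split is shown. A recursive procedure would repeat the same search inside each resulting interval, presumably stopping according to limits such as min_perc_bins and max_n_bins.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Information gain ratio of splitting x at `cut` with respect to labels y.
gain_ratio <- function(x, y, cut) {
  left <- x <= cut
  w <- c(mean(left), mean(!left))            # share of rows on each side
  if (any(w == 0)) return(0)                 # degenerate split, nothing gained
  gain <- entropy(y) -
    (w[1] * entropy(y[left]) + w[2] * entropy(y[!left]))
  split_info <- -sum(w * log2(w))            # entropy of the partition itself
  gain / split_info
}

# Toy data, made up for illustration: y changes class between x = 3 and x = 4.
x <- c(1, 2, 3, 4, 5, 6)
y <- c("no", "no", "no", "yes", "yes", "yes")

# Candidate cut points: midpoints between consecutive distinct values of x.
ux <- sort(unique(x))
cuts <- (head(ux, -1) + tail(ux, -1)) / 2

ratios <- vapply(cuts, function(cp) gain_ratio(x, y, cp), numeric(1))
best_cut <- cuts[which.max(ratios)]
best_cut
```

Dividing the gain by the split information penalizes very unbalanced splits, which is the usual motivation for preferring the gain ratio over the raw information gain.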

Usage

binning_rgr(.data, y, x, min_perc_bins = 0.1, max_n_bins = 5, ordered = TRUE)

Arguments

.data

a data frame.

y

character. name of the binary response variable. The variable must be a character or factor.

x

character. name of the continuous characteristic variable. It must have at least 5 distinct values, and Inf is not allowed.

min_perc_bins

numeric. minimum percentage of rows for each split or segment (controls the sample size). The default is 0.1 (10 percent).

max_n_bins

integer. maximum number of bins or segments into which the input variable is split. The default is 5 bins.

ordered

logical. whether to build an ordered factor or not.
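
The constraints listed for y, x, and min_perc_bins can be checked before calling the function. The following is a minimal sketch in base R. The helper check_binning_inputs() and the toy data frame are hypothetical and not part of dlookr, and rounding the minimum bin size up with ceiling() is an assumption about how the percentage maps to a row count.

```r
# Hypothetical helper, not part of dlookr: verifies the documented constraints
# and returns the smallest number of rows a bin may hold.
check_binning_inputs <- function(.data, y, x, min_perc_bins = 0.1) {
  yv <- .data[[y]]
  xv <- .data[[x]]
  stopifnot(
    is.character(yv) || is.factor(yv),       # y: character or factor
    length(unique(yv[!is.na(yv)])) == 2,     # y: binary
    is.numeric(xv),                          # x: numeric
    !any(is.infinite(xv)),                   # x: Inf and -Inf are not allowed
    length(unique(xv[!is.na(xv)])) >= 5      # x: at least 5 distinct values
  )
  # Assumption: the percentage is converted to a row count by rounding up.
  ceiling(min_perc_bins * nrow(.data))
}

# Toy data, made up for illustration.
toy <- data.frame(
  outcome = rep(c("yes", "no"), each = 10),
  score   = c(1:10, 6:15),
  stringsAsFactors = FALSE
)

min_rows <- check_binning_inputs(toy, "outcome", "score")
min_rows
```

Running the check first gives a clearer error than a failure deep inside the binning step, for example when the predictor still contains Inf after a division by zero.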

Details

This function is useful when developing a model that predicts y.
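
As a rough illustration of how a binned predictor enters a model, the sketch below fits a logistic regression on a binned variable. It uses simulated data and base R cut() with fixed, arbitrarily chosen breaks as a stand-in for the binning step, so it does not call binning_rgr() and makes no claim about the bins that function would select.

```r
set.seed(1)

# Simulated data for illustration only: the chance of "yes" rises with x.
n <- 200
x <- runif(n, 0, 10)
y <- factor(ifelse(runif(n) < plogis(x - 5), "yes", "no"))

# Stand-in for the binning step: fixed breaks chosen by hand.
# An unordered factor is used so that glm() applies treatment contrasts.
x_bin <- cut(x, breaks = c(0, 3, 7, 10), include.lowest = TRUE,
             ordered_result = FALSE)

# Logistic regression of the binary response on the binned predictor.
fit <- glm(y ~ x_bin, family = binomial())
coef(fit)
```

Note that R's modeling functions apply polynomial contrasts to ordered factors by default, so a factor built with ordered = TRUE yields linear and quadratic terms rather than one coefficient per bin. Whether that is desirable depends on the model.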

Value

an object of class "infogain_bins". The attributes of the "infogain_bins" class are as follows.

See Also

binning, binning_by, plot.infogain_bins.

Examples


library(dplyr)

# binning by recursive information gain ratio maximization, passing variable names as character strings
bin <- binning_rgr(heartfailure, "death_event", "creatinine")

# binning by recursive information gain ratio maximization, passing unquoted variable names
bin <- binning_rgr(heartfailure, death_event, creatinine)
bin

# summarize the infogain_bins object
summary(bin)

# visualize all information for the infogain_bins object
plot(bin)

# visualize WoE information for the infogain_bins object
plot(bin, type = "cross")

# visualize all information without typographic elements
plot(bin, type = "cross", typographic = FALSE)

# extract binned results
extract(bin) %>% 
  head(20)



[Package dlookr version 0.6.3 Index]