fast_discretization {dataPreparation}    R Documentation

Discretization

Description

Discretization of numeric variables (either equal_width or equal_freq).
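To illustrate the difference between the two binning types, here is a minimal base R sketch. It does not use the package and is not its implementation; the toy vector and the choice of 3 bins are assumptions made for illustration only.

```r
# Toy numeric vector with one large value (illustrative data, not from the package)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Equal width: 3 intervals of identical length between min(x) and max(x)
width_breaks <- seq(min(x), max(x), length.out = 4)
equal_width <- cut(x, breaks = width_breaks, include.lowest = TRUE)

# Equal frequency: breaks placed at quantiles so bins hold similar counts
freq_breaks <- quantile(x, probs = seq(0, 1, length.out = 4))
equal_freq <- cut(x, breaks = freq_breaks, include.lowest = TRUE)

# Compare how the observations spread across the bins
print(table(equal_width))
print(table(equal_freq))
```

With skewed data such as this, equal-width bins leave most observations in a single bin, while equal-frequency bins keep the counts balanced.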

Usage

fast_discretization(data_set, bins = NULL, verbose = TRUE)

Arguments

data_set

Matrix, data.frame or data.table

bins

Result of function build_bins (list, default to NULL).
To apply the same discretization to train and test sets, it is recommended to compute the bins with build_bins beforehand and pass them here. If left as NULL, build_bins will be called.
bins can also be carefully written by hand.
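As a sketch of the train/test recommendation above, the bins can be computed once on the train set and reused on the test set. The 80/20 split by row order and the restriction to the age column are assumptions made for illustration.

```r
library(dataPreparation)
library(data.table)

data("adult")
adult <- as.data.table(adult)

# Hypothetical split: first 80% of rows as train, the rest as test
n_train <- floor(0.8 * nrow(adult))
train <- adult[seq_len(n_train)]
test <- adult[(n_train + 1):nrow(adult)]

# Learn the bin boundaries on the train set only
bins <- build_bins(train, cols = "age", n_bins = 5, type = "equal_freq")

# Apply the same boundaries to both sets
train <- fast_discretization(train, bins = bins)
test <- fast_discretization(test, bins = bins)

# Compare the resulting age categories
print(table(train$age))
print(table(test$age))
```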

verbose

Should the algorithm talk? (Logical, default to TRUE)

Details

NAs will be put in an NA category.
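The idea of a dedicated NA category can be illustrated in base R. This is only an analogy, assuming toy values; the labels produced by fast_discretization may differ.

```r
# Toy ages with one missing value (illustrative data)
age <- c(23, 51, NA, 38)

# cut() leaves the missing value as NA in the resulting factor
binned <- cut(age, breaks = c(0, 40, Inf))

# addNA() turns NA into an explicit level, so it is counted like any other bin
binned_with_na <- addNA(binned)

print(levels(binned_with_na))
print(table(binned_with_na))
```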

Value

Same dataset discretized by reference.
If you don't want to edit by reference, please provide data_set = copy(data_set).
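A minimal sketch of keeping the input untouched, assuming the adult dataset shipped with the package and hand-written bins on age; copy() comes from data.table.

```r
library(dataPreparation)
library(data.table)

data("adult")
original <- as.data.table(adult)

# Pass a copy so that the discretization does not modify `original`
discretized <- fast_discretization(copy(original),
                                   bins = list(age = c(0, 40, +Inf)))

# `original` keeps its numeric age column; `discretized` holds the categories
print(is.numeric(original$age))
print(table(discretized$age))
```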

Examples

# Load data
data(tiny_messy_adult)
head(tiny_messy_adult)

# Compute bins
bins <- build_bins(tiny_messy_adult, cols = "auto", n_bins = 5, type = "equal_freq")

# Discretize
tiny_messy_adult <- fast_discretization(tiny_messy_adult, bins = bins)

# Control
head(tiny_messy_adult)

# Example with hand written bins
data("adult")
adult <- fast_discretization(adult, bins = list(age = c(0, 40, +Inf)))
print(table(adult$age))

[Package dataPreparation version 1.1.1 Index]