one_hot_encoder {dataPreparation}    R Documentation

One hot encoder

Description

Transform a factor column into 0/1 columns, with one new column per distinct value of the original column.
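
As a rough illustration of the transformation, the sketch below uses base R only. It is not the package's implementation, and the vector x and the resulting column names are made up for this example. Each distinct value becomes its own 0/1 indicator column:

```r
# Hypothetical input: a factor with three distinct values
x <- factor(c("a", "b", "a", "c"))

# One indicator column per level: 1L where the row holds that level, else 0L
encoded <- sapply(levels(x), function(lv) as.integer(x == lv))

print(encoded)
```

Every row has exactly one indicator set, so the number of new columns equals the number of distinct values.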

Usage

one_hot_encoder(
  data_set,
  encoding = NULL,
  type = "integer",
  verbose = TRUE,
  drop = FALSE
)

Arguments

data_set

Matrix, data.frame or data.table

encoding

Result of the function build_encoding (list, default to NULL).
To apply the same encoding to train and test sets, it is recommended to compute it beforehand with build_encoding. If left NULL, build_encoding will be called.

type

Class of the generated columns: "integer" (0L/1L), "numeric" (0/1), or "logical" (TRUE/FALSE) (character, default to "integer")

verbose

Should the function log? (logical, default to TRUE)

drop

Should the original columns be dropped once they have been encoded? (logical, default to FALSE)

Details

If you don't want your data set to be edited, consider passing copy(data_set) as input.
Be careful when using this function: it generates as many columns as there are distinct values in each encoded column, and might use a lot of RAM. To be safe, you can use the parameter min_frequency in build_encoding.
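
The two precautions above can be combined as in the following sketch. It assumes that min_frequency is expressed as a share of rows between 0 and 1. The 0.05 threshold is arbitrary, and copy() is taken from the data.table package:

```r
library(dataPreparation)
data(tiny_messy_adult)

# Build the encoding, skipping rare categories to limit the number of new columns.
# Assumption: min_frequency is a share of rows between 0 and 1; 0.05 is arbitrary.
encoding <- build_encoding(
  tiny_messy_adult,
  cols = c("marital", "occupation"),
  min_frequency = 0.05,
  verbose = FALSE
)

# Encode a copy so that the original tiny_messy_adult is left untouched
encoded <- one_hot_encoder(
  data.table::copy(tiny_messy_adult),
  encoding = encoding,
  drop = TRUE,
  verbose = FALSE
)

# Compare column counts before and after encoding
ncol(tiny_messy_adult)
ncol(encoded)
```

Comparing the two column counts before encoding a full-size data set gives a quick idea of how much wider the result will be.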

Value

data_set edited by reference with new columns.

Examples

data(tiny_messy_adult)

# Compute encoding
encoding <- build_encoding(tiny_messy_adult, cols = c("marital", "occupation"), verbose = TRUE)

# Apply it
tiny_messy_adult <- one_hot_encoder(tiny_messy_adult, encoding = encoding, drop = TRUE)

# Apply same encoding to adult
data(adult)
adult <- one_hot_encoder(adult, encoding = encoding, drop = TRUE)

# To get logical (TRUE/FALSE) columns, set the type argument
# (adult is reloaded because it was overwritten above)
data(adult)
adult <- one_hot_encoder(adult, encoding = encoding, type = "logical", drop = TRUE)

[Package dataPreparation version 1.1.1 Index]