set_col_as_factor {dataPreparation}    R Documentation

Set columns as factor

Description

Set columns as factors and control the number of unique elements, to avoid creating factors with too many levels.

Usage

set_col_as_factor(data_set, cols = "auto", n_levels = 53, verbose = TRUE)

Arguments

data_set

Matrix, data.frame or data.table

cols

List of name(s) of the column(s) of data_set to transform into factors. To transform all columns, set it to "auto" (character, default to "auto").

n_levels

Maximum number of levels for a factor (integer, default to 53). Set it to -1 to disable this control.

verbose

Should the function log its progress? (logical, default to TRUE)

Details

Controlling the number of levels helps you distinguish true categorical columns from plain character columns that should be handled in another way.
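The idea behind the level control can be illustrated with a minimal base R sketch. The helper as_factor_if_few_levels below is hypothetical and is not part of dataPreparation; in particular, how the package treats a column whose number of distinct values is exactly n_levels is an assumption here, so the example avoids that boundary case.

```r
# Hypothetical helper, NOT the package's implementation: convert a vector to
# a factor only when its number of distinct values stays within a limit.
as_factor_if_few_levels <- function(x, n_levels = 53) {
  # n_levels = -1 disables the check, mirroring the documented argument.
  # The "<=" comparison at the boundary is an assumption of this sketch.
  if (n_levels == -1 || length(unique(x)) <= n_levels) {
    return(as.factor(x))
  }
  x  # too many distinct values: leave the vector untouched
}

df <- data.frame(
  colour = c("red", "blue", "red", "blue"),  # 2 distinct values
  id     = c("a1", "b2", "c3", "d4"),        # 4 distinct values
  stringsAsFactors = FALSE
)

# Apply the same limit to both columns
df$colour <- as_factor_if_few_levels(df$colour, n_levels = 3)
df$id     <- as_factor_if_few_levels(df$id, n_levels = 3)

sapply(df, class)
# colour becomes a factor; id stays character
```

With a limit of 3, the low-cardinality column colour is converted, while the identifier-like column id is left as character so that it can be handled separately.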

Value

data_set (as a data.table), with the specified columns set as factor or logical.

Examples

# Load the package and the tiny_messy_adult data set
library(dataPreparation)
data(tiny_messy_adult)

# We will change the education column
tiny_messy_adult <- set_col_as_factor(tiny_messy_adult, cols = "education")

sapply(tiny_messy_adult[, .(education)], class)
# education is now a factor

[Package dataPreparation version 1.1.1 Index]