encode_dummy {categoryEncodings}

R Documentation

Encode a given factor variable using dummy variables

Description

Transforms the original design matrix using a dummy variable encoding.

Usage

encode_dummy(X, fact, keep_factor = FALSE, encoding_only = FALSE)

Arguments

`X`	The data.frame/data.table to transform.
`fact`	The factor variable to encode, given either as a positive integer column index or as the column name.
`keep_factor`	Whether to keep the original factor column (defaults to FALSE).
`encoding_only`	Whether to return only the new columns instead of the full transformed dataset. Defaults to FALSE, which returns the full dataset.

Details

Applies the basic dummy variable encoding, in which the reference class is represented by 0 in every dummy column. The reference class is always the first class observed.
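
The idea can be illustrated with a minimal base R sketch. It does not use the package and is not its implementation. The toy vector `x` and the names `f` and `dummies` are hypothetical, and it assumes that "first class observed" means the first value to appear in the data.

```r
# Toy categorical variable (hypothetical data for illustration only).
x <- c("b", "a", "c", "a", "b")

# Order the levels by first appearance, so "b" becomes the reference class
# (assumption: "first class observed" means first to appear in the data).
f <- factor(x, levels = unique(x))

# Treatment contrasts give an intercept plus one 0/1 column per
# non-reference level; drop the intercept to keep only the dummy columns.
dummies <- model.matrix(~ f, contrasts.arg = list(f = "contr.treatment"))[, -1, drop = FALSE]
colnames(dummies) <- levels(f)[-1]

dummies
```

Rows belonging to the reference class `"b"` contain 0 in every column, while every other row has a single 1 in the column for its own level.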

Value

A new data.table X which contains the new columns and optionally the old factor.

Examples


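library(categoryEncodings)

# Five numeric predictors plus one categorical column drawn from 10 random letters
design_mat <- cbind( data.frame( matrix(rnorm(5*100),ncol = 5) ),
                     sample( sample(letters, 10), 100, replace = TRUE)
                     )
colnames(design_mat)[6] <- "factor_var"

# Replace factor_var with its dummy columns and return the full dataset
encode_dummy(X = design_mat, fact = "factor_var", keep_factor = FALSE)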

[Package categoryEncodings version 1.4.3 Index]