one_hot {mltools} | R Documentation |
One Hot Encode
Description
One-Hot-Encode unordered factor columns of a data.table
Usage
one_hot(dt, cols = "auto", sparsifyNAs = FALSE, naCols = FALSE,
dropCols = TRUE, dropUnusedLevels = FALSE)
Arguments
dt |
A data.table |
cols |
Which column(s) should be one-hot-encoded? DEFAULT = "auto" encodes all unordered factor columns |
sparsifyNAs |
Should NAs be converted to 0s? |
naCols |
Should columns be generated to indicate the presence of NAs? Will only apply to factor columns with at least one NA |
dropCols |
Should the resulting data.table exclude the original columns which are one-hot-encoded? |
dropUnusedLevels |
Should unused factor levels be dropped? If FALSE, a column of all 0s is generated for each unused level |
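As a rough illustration of the cols argument, the sketch below encodes a single named column and leaves a second factor column untouched. The table and its size column are made up for this example, and the block assumes mltools and data.table are installed.

```r
library(data.table)
library(mltools)

# Hypothetical data with two unordered factor columns
dt2 <- data.table(
  ID = 1:3,
  color = factor(c("red", "blue", "blue")),
  size = factor(c("S", "M", "S"))
)

# Encode only `color`; `size` should remain an ordinary factor column
res <- one_hot(dt2, cols = "color")
names(res)
```

With the default cols = "auto", both color and size would be encoded, since both are unordered factors.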
Details
One-hot-encoding converts an unordered categorical vector (i.e. a factor) to multiple binarized vectors, where each binary vector of 1s and 0s indicates the presence of a class (i.e. level) of the original vector.
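The idea can be sketched in base R without the package. This is only an illustration of the concept, not the mltools implementation. The color_ column naming is an assumption chosen for readability.

```r
# A factor with one missing value and one unused level ("green")
color <- factor(c("red", NA, "blue", "blue"), levels = c("blue", "green", "red"))

# Build one 0/1 vector per level. Comparing NA with a level yields NA,
# so a missing value stays NA in every indicator column.
encoded <- sapply(levels(color), function(lv) as.integer(color == lv))
colnames(encoded) <- paste("color", levels(color), sep = "_")
encoded
```

Each row has at most a single 1, in the column of the level that row held. The unused level green produces a column containing no 1s.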
Examples
library(mltools)
library(data.table)
dt <- data.table(
ID = 1:4,
color = factor(c("red", NA, "blue", "blue"), levels=c("blue", "green", "red"))
)
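one_hot(dt)                         # encode all unordered factor columns (here, color)
one_hot(dt, sparsifyNAs=TRUE)       # convert NAs to 0s
one_hot(dt, naCols=TRUE)            # add a column indicating NAs
one_hot(dt, dropCols=FALSE)         # keep the original color column
one_hot(dt, dropUnusedLevels=TRUE)  # drop the unused "green" level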
[Package mltools version 0.3.5 Index]