encode_cats {eHDPrep}    R Documentation

Encode categorical variables using one-hot encoding.

Description

Variables specified in ... are replaced with new variables, one per unique category, each describing the presence of that category. In the generated variable names, space characters are replaced with "_" and commas are removed.
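To illustrate the idea, the following is a minimal base R sketch of one-hot encoding and the name clean-up described above. It is not the eHDPrep implementation: the toy data frame, the column name status, and the "<variable>_<category>" naming pattern are assumptions made for illustration only.

```r
# Toy data; the column name and its values are hypothetical.
toy <- data.frame(
  status = c("single", "married", "single", "widowed, remarried"),
  stringsAsFactors = FALSE
)

# One indicator column per unique category (1 = present, 0 = absent).
cats <- sort(unique(toy$status))
indicators <- sapply(cats, function(k) as.integer(toy$status == k))

# Assumed naming pattern "<variable>_<category>":
# commas are removed first, then spaces are replaced with "_".
clean <- gsub(" ", "_", gsub(",", "", cats))
colnames(indicators) <- paste0("status_", clean)

encoded <- as.data.frame(indicators)
encoded
```

Each row of the result has exactly one indicator set to 1, because every observation in the toy data belongs to exactly one category.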

Usage

encode_cats(data, ...)

Arguments

data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).

...

<tidy-select> One or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data frame, so expressions like x:y can be used to select a range of variables.

Value

A tibble in which the selected variables are replaced by their one-hot encoded variables.

Examples

require(magrittr)
require(dplyr)

data(example_data)

# encode one variable
encode_cats(example_data, marital_status) %>%
  select(starts_with("marital_status"))

# encode multiple variables
encoded <- encode_cats(example_data, diabetes, marital_status)

select(encoded, starts_with("marital_status"))
# diabetes_type included below but was not modified:
select(encoded, starts_with("diabetes")) 

[Package eHDPrep version 1.3.3 Index]