| ohe_commas {lares} | R Documentation |
One Hot Encoding for a Vector with Comma Separated Values
Description
One-hot encodes one or more variables whose values are lists separated by commas or by another delimiter.
Usage
ohe_commas(df, ..., sep = ",", noval = "NoVal", remove = FALSE)
Arguments
df |
Dataframe. May contain one or more columns with comma-separated values, each of which will be split into one-hot encoded columns. |
... |
Variables. Which variables to split into new columns? |
sep |
Character. Which regular expression separates the elements? |
noval |
Character. Text used to label missing values. |
remove |
Boolean. Remove original variables? |
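Because sep is interpreted as a regular expression, metacharacters such as "+" need escaping. The base R sketch below only illustrates that point with strsplit(); the helper name split_values is hypothetical and is not part of lares.

```r
# Hypothetical helper (not part of lares): split one string on a regex
# separator and trim surrounding whitespace from each element.
split_values <- function(x, sep = ",") {
  trimws(strsplit(x, sep)[[1]])
}

# A comma has no special meaning in a regular expression
split_values("AA, D")

# "+" is a regex metacharacter, so it is escaped as "\\+"
split_values("AA+BB+AA", sep = "\\+")
```

Trimming whitespace after the split is an assumption of this sketch; it makes "AA, D" and "AA,D" yield the same elements.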
Value
A data.frame in which all features are either numerical by nature or transformed with one-hot encoding.
See Also
Other Data Wrangling:
balance_data(),
categ_reducer(),
cleanText(),
date_cuts(),
date_feats(),
file_name(),
formatHTML(),
holidays(),
impute(),
left(),
normalize(),
num_abbr(),
ohse(),
quants(),
removenacols(),
replaceall(),
replacefactor(),
textFeats(),
textTokenizer(),
vector2text(),
year_month(),
zerovar()
Other One Hot Encoding:
date_feats(),
holidays(),
ohse()
Examples
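# Example data: x uses commas, z mixes "+" and "," as separators
df <- data.frame(
  id = c(1:5),
  x = c("AA, D", "AA,B", "B, D", "A,D,B", NA),
  z = c("AA+BB+AA", "AA", "BB, AA", NA, "BB+AA")
)
# Encode x and drop the original column
ohe_commas(df, x, remove = TRUE)
# Encode z, splitting on "+" (escaped because sep is a regular expression)
ohe_commas(df, z, sep = "\\+")
# Encode several variables at once with the default separator
ohe_commas(df, x, z)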
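For readers who want to see the idea without the package, here is a minimal base R sketch of one-hot encoding a delimited character vector. It is not the lares implementation: the function name ohe_sketch is hypothetical, and the column names and column types that ohe_commas() actually returns may differ.

```r
# Hypothetical helper, base R only; not the lares implementation.
ohe_sketch <- function(x, sep = ",", noval = "NoVal") {
  x <- as.character(x)
  x[is.na(x) | x == ""] <- noval             # label missing entries (assumed behaviour)
  parts <- lapply(strsplit(x, sep), trimws)  # one character vector per row
  lvls <- unique(unlist(parts))              # distinct values, in order of appearance
  # One logical column per distinct value: TRUE where the row contains it
  cols <- lapply(lvls, function(v) {
    vapply(parts, function(p) v %in% p, logical(1))
  })
  names(cols) <- lvls
  data.frame(cols, check.names = FALSE)
}

ohe_sketch(c("AA, D", "AA,B", "B, D", "A,D,B", NA))
```

Each distinct value becomes its own indicator column, so a row such as "A,D,B" is TRUE in three columns at once, which is what distinguishes this from encoding the whole string as a single category.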