step_clean_names {textrecipes} | R Documentation
Clean Variable Names
Description
step_clean_names() creates a specification of a recipe step that will
clean variable names so the names consist only of letters, numbers, and the
underscore.
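To make "letters, numbers, and the underscore" concrete, here is a minimal base R sketch of that kind of cleaning. The helper sketch_clean() is hypothetical and is not part of textrecipes; it only approximates the idea and is not claimed to be what the step runs internally.

```r
# Hypothetical helper (not part of textrecipes): a rough approximation of
# name cleaning that uses base R only.
sketch_clean <- function(x) {
  x <- gsub("[^A-Za-z0-9]+", "_", x)  # replace runs of other characters with "_"
  x <- gsub("^_+|_+$", "", x)         # trim leading and trailing underscores
  tolower(x)                          # lower-case the result
}

sketch_clean(c("Solar.R", "Ozone", "Air Temp (F)"))
# "solar_r" "ozone" "air_temp_f"
```

This sketch ignores cases a real cleaner has to handle, such as duplicate names after cleaning or names that start with a digit.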
Usage
step_clean_names(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  clean = NULL,
  skip = FALSE,
  id = rand_id("clean_names")
)
Arguments
recipe |
A recipe object. The step will be added to the sequence of operations for this recipe. |
... |
One or more selector functions to choose which
variables are affected by the step. See |
role |
Not used by this step since no new variables are created. |
trained |
A logical to indicate if the quantities for preprocessing have been estimated. |
clean |
A named character vector to clean variable names. This is |
skip |
A logical. Should the step be skipped when the
recipe is baked by |
id |
A character string that is unique to this step to identify it. |
Value
An updated version of recipe with the new step added
to the sequence of existing steps (if any).
Tidying
When you tidy() this step, a tibble is returned with columns terms
(the new clean variable names) and value (the original variable names).
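The relationship between the terms and value columns described above can be sketched in base R. This assumes the clean argument stores new names as the vector's names and original names as its values, which matches the tidy() description but is an assumption here; the cleaned names shown are illustrative, and tidy_sketch is a hypothetical object, not actual tidy() output.

```r
# Assumed shape of the `clean` vector after prep(): names are the new
# names, values are the original names (illustrative values only).
clean <- c(ozone = "Ozone", solar_r = "Solar.R", wind = "Wind")

# Lay the mapping out the way the tidy() columns are described.
tidy_sketch <- data.frame(
  terms = names(clean),   # new clean variable names
  value = unname(clean),  # original variable names
  stringsAsFactors = FALSE
)
tidy_sketch

# Renaming columns with such a mapping: look up each current name
# among the original names and take the matching new name.
df <- data.frame(Ozone = 41, Solar.R = 190, Wind = 7.4)
names(df) <- names(clean)[match(names(df), clean)]
names(df)
```

Keeping the mapping as a single named vector means the same lookup can be reused on new data with the original column names.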
Case weights
The underlying operation does not allow for case weights.
See Also
step_clean_levels(), recipes::step_factor2string(), recipes::step_string2factor(), recipes::step_regex(), recipes::step_unknown(), recipes::step_novel(), recipes::step_other()
Other Steps for Text Cleaning:
step_clean_levels()
Examples
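library(recipes)
library(textrecipes)  # provides step_clean_names()
data(airquality)

# Split the rows into training and test sets
air_tr <- tibble(airquality[1:100, ])
air_te <- tibble(airquality[101:153, ])

rec <- recipe(~., data = air_tr)
rec <- rec %>%
  step_clean_names(all_predictors())

# Estimate the step on the training data
rec <- prep(rec, training = air_tr)
# Inspect the mapping between new and original names
tidy(rec, number = 1)

# Apply the trained recipe to each data set
bake(rec, air_tr)
bake(rec, air_te)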