add_lab_dummies {labelr} | R Documentation |
Add A Dummy Variable for Each Value Label
Description
For one or more value-labeled data.frame columns, create a dummy (aka indicator) variable for each unique value label.
Usage
add_lab_dummies(
data,
vars,
simple.names = TRUE,
sep = "_",
prefix.length = 4,
suffix.length = 7
)
ald(
data,
vars,
simple.names = TRUE,
sep = "_",
prefix.length = 4,
suffix.length = 7
)
Arguments
data |
a data.frame with at least one value-labeled variable (column). |
vars |
the value-labeled variable or variables from which dummy variable columns will be generated (variable names must be quoted). |
simple.names |
if TRUE (the default), dummy variable names will be the parent variable's name, followed by the sep separator (see above), followed by an automatically generated numerical id suffix. For example two dummy variable columns created from value-labeled column "tacos" using the sep argument of "." would be given the respective names "tacos.1" and "tacos.2"). |
sep |
the separator character to use in constructing dummy variable column names (appears between the dummy variable name prefix and suffix). |
prefix.length |
(NOTE: This argument is ignored if simple.names = TRUE). A 1L integer indicating the number of leading characters of the parent column's name to use in constructing dummy variable column names. For example, if simple.names = FALSE, if prefix.length = 2, and for a parent column named "tacos", each derivative dummy variable column name will begin with the prefix string "ta," (corresponding to the first two characters of "tacos"), followed by the sep separator character (see sep param, above), followed by the suffix string (see suffix.length param, below). |
suffix.length |
(NOTE: This argument is ignored if simple.names = TRUE). A 1L integer indicating the number of leading characters of each variable value label to use in constructing dummy variable column names. For example, consider the following setup: parent column name is "tacos"; prefix.length = 3; sep = "", and suffix.length = 2. In this case, if simple.names = FALSE, then a dummy variable column named "tac_so" would be created to represent those values of the tacos" column that have the value label "soft" (because "tac" are the first three letters of the parent column name, the separator is "", and "so" are the first two characters in "soft"). |
Details
If the default of simple.names is used, dummy variable column names will be the "parent" variable column name, followed by a separator character (by default, "_"), followed by a number, to differentiate each dummy variable from the others in the set. If one of the automatically generated dummy column names is already "taken" by a pre-existing data.frame column, an error to this effect will be thrown. If simple.names = FALSE, then prefix.length and suffix.length arguments will be used to construct dummy variable column names using the leading characters of the parent column name, followed by a separator character, followed by the leading characters of the value label. (white spaces in the value label will be replaced with the separator character).
Note: ald()
is an alias function that behaves identically to
add_lab_dummies
.
Value
A data.frame with dummy variables added for all value labels of the value-labeled columns supplied to the vars argument.
Examples
# one variable at a time, mtcars
df <- mtcars
# now, add 1-to-1 value labels
df <- add_val_labs(
data = df,
vars = "am",
vals = c(0, 1),
labs = c("automatic", "manual")
)
df <- add_val_labs(
data = df,
vars = "carb",
vals = c(1, 2, 3, 4, 6, 8),
labs = c(
"1-carb", "2-carbs",
"3-carbs", "4-carbs",
"6-carbs", "8-carbs"
)
)
# var arg can be unquoted if using add_val1()
# note that this is not add_val_labs(); add_val1() has "var" (not "vars) arg
df <- add_val1(
data = df,
var = cyl, # note, "var," not "vars" arg
vals = c(4, 6, 8),
labs = c(
"four-cyl",
"six-cyl",
"eight-cyl"
)
)
# add many-to-1 value labels
df <- add_m1_lab(
data = df,
vars = "gear",
vals = 4:5,
lab = "4+"
)
# add quartile-based numerical range value labels
df <- add_quant_labs(
data = df,
vars = "disp",
qtiles = 4
)
# add "pretty" cut-based numerical range value labels
(mpg_bins <- pretty(range(df$mpg, na.rm = TRUE)))
df <- add_quant_labs(data = df, vars = "mpg", vals = mpg_bins)
# add dummy variables for the labels of column "am"
df2 <- add_lab_dummies(df, "am",
sep = ".", simple.names = FALSE,
prefix.length = 2, suffix.length = 6
)
df2
# add dummy variables for the labels of columns "mpg", "gear", and "cyl
df3 <- add_lab_dummies(df, c("mpg", "gear", "cyl"), simple.names = TRUE) # default
df3