dum2nom {quest} | R Documentation |
Dummy Variables to a Nominal Variable
Description
dum2nom converts dummy variables to a nominal variable. The information from
the dummy columns in a data.frame is combined into a character vector (or
factor if rtn.fct = TRUE) representing a nominal variable. The unique values
of the nominal variable will be the dummy colnames (i.e., dum.nm). Note that
*all* the dummy variables associated with a nominal variable are required for
this function to work properly. In regression-like models, data analysts
typically exclude the dummy variable for the category that is the reference
group. If d = number of categories in the nominal variable, then that leads to
d - 1 dummy variables in the model. dum2nom requires all d dummy variables.
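The difference between d - 1 and d dummy variables can be illustrated with base R alone. The following is a minimal sketch on hypothetical toy data (the factor f and the object names are assumptions made for illustration, not part of the package). It also shows one way to rebuild the omitted reference dummy, which only holds when every row belongs to exactly one category and there are no missing values.

```r
# Hypothetical nominal variable with d = 3 categories
f <- factor(c("a", "b", "c", "a"))

# With an intercept (default treatment contrasts), the reference category "a"
# is dropped, leaving d - 1 = 2 dummy columns
x_ref <- model.matrix(~ f)[, -1, drop = FALSE]

# Without an intercept, all d = 3 dummy columns are kept
x_all <- model.matrix(~ 0 + f)

ncol(x_ref)
ncol(x_all)

# Rebuild the omitted reference dummy as 1 minus the row sum of the others
# (assumes each row belongs to exactly one category and nothing is missing)
fa <- 1 - rowSums(x_ref)
```

A full set such as x_all (converted to a data.frame) is the kind of input that dum2nom expects, whereas x_ref on its own is not.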
Usage
dum2nom(data, dum.nm, yes = 1L, rtn.fct = FALSE)
Arguments
data | data.frame of data. |
dum.nm | character vector of colnames from data. |
yes | atomic vector of length 1 specifying the unique value of the category in each dummy column. This must be the same value for all the dummy variables. |
rtn.fct | logical vector of length 1 specifying whether the return object should be a factor (TRUE) or a character vector (FALSE). |
Details
dum2nom tests to ensure that data[dum.nm] are indeed a set of dummy columns.
First, the dummy columns are expected to have the same mode such that a single
yes value applies across all the dummy columns. Second, each row in
data[dum.nm] is expected to have either 0 or 1 instance of yes. If there is
more than one instance of yes in a row, then the function stops with an error.
If there are 0 instances of yes in a row (e.g., all missing values), NA is
returned for that row. Note that any value other than yes will be treated as a
no.
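The row-wise rules above can be sketched in base R. This is not the package's implementation: dum2nom_sketch is a hypothetical helper written only to illustrate the described behavior (at most one yes per row, NA when there is none, any other value treated as a no), and it omits the mode check and the rtn.fct option.

```r
# Hypothetical illustration of the row-wise rules; not the quest source code
dum2nom_sketch <- function(data, dum.nm, yes = 1L) {
  # TRUE where a cell equals `yes`; NA and any other value count as a no
  is_yes <- vapply(data[dum.nm],
                   function(col) !is.na(col) & col == yes,
                   logical(nrow(data)))
  is_yes <- matrix(is_yes, nrow = nrow(data), dimnames = list(NULL, dum.nm))

  # More than one `yes` in a row means the columns overlap
  if (any(rowSums(is_yes) > 1L)) {
    stop("at least one row has more than one instance of `yes`")
  }

  # Rows without any `yes` stay NA; the others get the matching colname
  out <- rep(NA_character_, nrow(data))
  hits <- which(is_yes, arr.ind = TRUE)
  out[hits[, "row"]] <- dum.nm[hits[, "col"]]
  out
}

# Toy data: row 3 is all missing and row 4 is all zeros
toy <- data.frame(a = c(1L, 0L, NA, 0L), b = c(0L, 1L, NA, 0L))
res <- dum2nom_sketch(toy, dum.nm = c("a", "b"))
res  # "a" "b" NA NA
```

Rows 3 and 4 both yield NA because neither contains a yes, which mirrors the note that a row with no yes value cannot be assigned to any category.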
Value
character vector (or factor if rtn.fct = TRUE) containing the unique values of
dum.nm, one for each dummy variable.
See Also
Examples
dum <- data.frame(
   "Quebec_nonchilled" = ifelse(CO2$"Type" == "Quebec" & CO2$"Treatment" == "nonchilled",
      yes = 1L, no = 0L),
   "Quebec_chilled" = ifelse(CO2$"Type" == "Quebec" & CO2$"Treatment" == "chilled",
      yes = 1L, no = 0L),
   "Mississippi_nonchilled" = ifelse(CO2$"Type" == "Mississippi" & CO2$"Treatment" == "nonchilled",
      yes = 1L, no = 0L),
   "Mississippi_chilled" = ifelse(CO2$"Type" == "Mississippi" & CO2$"Treatment" == "chilled",
      yes = 1L, no = 0L)
)
dum2nom(data = dum, dum.nm = names(dum)) # default
dum2nom(data = dum, dum.nm = names(dum), rtn.fct = TRUE) # return as a factor
## Not run:
dum2nom(data = npk, dum.nm = c("N","P","K")) # error due to overlapping dummy columns
dum2nom(data = mtcars, dum.nm = c("vs","am")) # error due to overlapping dummy columns
## End(Not run)