check_all_recode {tntpr} | R Documentation |
Process a range of check-all-that-apply response columns for correct tabulation.
Description
Some survey software returns check-all-that-apply response columns where missing values could indicate either that the respondent skipped the question entirely, or that they did not select that particular answer choice. To count the responses properly, the cases where a respondent did not check any of the choices - i.e., they skipped the question - should not be counted in the denominator (assuming that the choices were completely exhaustive, or that there was an NA option).
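As a small worked illustration of why the denominator matters, the numbers below are made up for this sketch and are not taken from any real survey:

```r
# Hypothetical counts for one check-all-that-apply question
n_respondents <- 5   # everyone who was shown the question
n_skipped     <- 2   # checked none of the choices
n_chose_a     <- 2   # checked choice "a"

# Skippers wrongly left in the denominator
share_naive <- n_chose_a / n_respondents

# Skippers excluded: share among those who actually answered
share_answered <- n_chose_a / (n_respondents - n_skipped)

share_naive      # 0.4
share_answered   # 2 of 3 respondents who answered
```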
This function takes a data.frame and range of columns containing all answer choices to a check-all-that-apply question and updates the columns in the data.frame to contain one of three values: 1 if the choice was selected; 0 if the respondent chose another option but not this one; or NA if the respondent skipped the question (i.e., they did not select any of the choices) and thus their response is truly missing.
It also takes the single text value in each column and adds it as a label attribute to that data.frame column.
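The recoding and labelling described above can be sketched in base R. This is only an illustration of the idea, not the tntpr implementation: the helper name `recode_sketch` and the sample data are hypothetical, and the sketch assumes each column holds a single text value.

```r
# Hypothetical helper, not the tntpr implementation: recode check-all columns
# to 1 / 0 / NA and store each column's text value as a "label" attribute.
recode_sketch <- function(dat, cols) {
  # TRUE for respondents who selected at least one choice
  answered <- rowSums(!is.na(dat[cols])) > 0
  for (col in cols) {
    values <- dat[[col]]
    label <- unique(values[!is.na(values)])[1]  # the column's text value
    recoded <- ifelse(is.na(values), 0, 1)      # 1 = selected, 0 = not selected
    recoded[!answered] <- NA                    # skipped the whole question
    attr(recoded, "label") <- label
    dat[[col]] <- recoded
  }
  dat
}

survey <- data.frame(
  q1_1 = c("a", "a", NA, NA),
  q1_2 = c("b", NA, "b", NA),
  stringsAsFactors = FALSE
)
recoded <- recode_sketch(survey, c("q1_1", "q1_2"))
as.vector(recoded$q1_1)      # 1 1 0 NA
attr(recoded$q1_1, "label")  # "a"
```

The last respondent checked nothing, so both of their values become NA rather than 0.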
This function accommodates an open-response column, to get the correct denominator when some respondents have skipped all check variables but written something in. This passing over of the offered choices is an implicit rejection of them, not a "missing." Such a text variable will throw a warning - which may be okay - and will then be recoded into a binary 1/0 variable indicating a response. Such a text variable will be assigned the label "Other". Consider preserving the original respondent text values prior to this point as a separate column if needed.
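A rough base-R sketch of how an open-response column changes the denominator, and of keeping the original text in a separate column before it is collapsed to 1/0. The data and the column name `q1_other_text` are made up for illustration:

```r
# Illustrative data: the last respondent skipped the offered choice
# but wrote something in the open-response column.
responses <- data.frame(
  q1_1     = c("a", "a", NA, NA),
  q1_other = c(NA, "something else", NA, "not any of these"),
  stringsAsFactors = FALSE
)

# Keep the original write-in text before any recoding (hypothetical column name)
responses$q1_other_text <- responses$q1_other

# A respondent counts as having answered if any check column is non-missing
check_cols <- c("q1_1", "q1_other")
answered <- rowSums(!is.na(responses[check_cols])) > 0

n_with_other    <- sum(answered)                  # 3: write-in respondent included
n_without_other <- sum(!is.na(responses$q1_1))    # 2: write-in respondent dropped
```

With the open-response column included, the respondent who only wrote something in stays in the denominator, so choice "a" is selected by 2 of 3 respondents rather than 2 of 2.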
check_all_recode() prepares the data.frame for a call to its sister function check_all_count(). The label attribute is accessed by that function.
Usage
check_all_recode(dat, ..., set_labels = TRUE)
Arguments
dat |
a data.frame with survey data |
... |
unquoted variable names containing the answer choices. Can be specified as a range, i.e., |
set_labels |
should the label attribute of the columns be overwritten with the column text? Leave this as TRUE unless there are existing label attributes you don't wish to overwrite. |
Value
the original data.frame with the specified column range updated, and with label attributes on the questions.
Examples
x <- data.frame( # 4th person didn't respond at all
unrelated = 1:5,
q1_1 = c("a", "a", "a", NA, NA),
q1_2 = c("b", "b", NA, NA, NA),
q1_3 = c(NA, NA, "c", NA, NA),
q1_other = c(NA, "something else", NA, NA, "not any of these")
)
library(tntpr) # provides check_all_recode()
library(dplyr) # for the %>% pipe
x %>%
check_all_recode(q1_1:q1_other)
# You can use any of the dplyr::select() helpers to identify the columns:
x %>%
check_all_recode(contains("q1"))