R: Multiple Univariate Frequency Tables

freqs_by {quest}

R Documentation

Multiple Univariate Frequency Tables

Description

freqs_by creates a frequency table for a set of variables in a data.frame by group. Depending on total, frequencies for all the variables together can also be returned by group. The function is most useful for sets of variables with similar unique values (e.g., items from a questionnaire with similar response options).

Usage

freqs_by(
  data,
  vrb.nm,
  grp.nm,
  prop = FALSE,
  useNA = "always",
  total = "no",
  sep = "."
)

Arguments

`data`	data.frame of data.
`vrb.nm`	character vector of colnames from `data` specifying the variables.
`grp.nm`	character vector of colnames from `data` specifying the groups.
`prop`	logical vector of length 1 specifying whether the frequencies should be counts (FALSE) or proportions (TRUE). Note, whether the proportions include missing values depends on the `useNA` argument.
`useNA`	character vector of length 1 specifying how missing values should be handled. The three options are 1) "no" = do not include NA frequencies in the return object, 2) "ifany" = include NA frequencies only if there are any missing values (in any variable from `data[vrb.nm]`), or 3) "always" = do include NA frequencies regardless of whether there are missing values or not.
`total`	character vector of length 1 specifying whether the frequencies for the set of variables as a whole should be returned. The name "total" refers to tabulating the frequencies for the variables from `data[vrb.nm]` together as a set. The three options are 1) "no" = do not include a row for the total frequencies in the return object, 2) "yes" = do include the total frequencies as the first row in the return object, or 3) "only" = only include the total frequencies as a single row in the return object and do not include rows for each of the individual column frequencies in `data[vrb.nm]`.
`sep`	character vector of length 1 specifying the string to combine the group values together with. `sep` is only used if there are multiple grouping variables (i.e., `length(grp.nm)` > 1).
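
The interplay between `prop` and `useNA` noted above can be illustrated with base R alone. This is a minimal sketch using table and prop.table on made-up data; it assumes freqs_by follows the same convention as table, which the argument descriptions suggest but this page does not state outright.

```r
# Toy item with one missing response (illustrative data, not from the package)
x <- c(1, 2, 2, NA)

# useNA = "no": the NA is dropped, so proportions are over observed values only
p_no <- prop.table(table(x, useNA = "no"))

# useNA = "always": the NA gets its own cell and enters the denominator
p_always <- prop.table(table(x, useNA = "always"))

as.vector(p_no)      # 1/3, 2/3
as.vector(p_always)  # 0.25, 0.50, 0.25
```

The same counts therefore yield different proportions depending on whether the missing-value cell is part of the table.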

Details

freqs_by uses plyr::rbind.fill to combine the results from table applied to each variable into a single data.frame for each group. If, within a group, a variable from data[vrb.nm] lacks a value that is present in other variables from data[vrb.nm] for that group, then the frequency of that value for that variable in the return object will be 0.
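
The zero-filling behavior can be mimicked in base R. The sketch below is not the package's implementation (it avoids plyr and returns matrices rather than data.frames); the data and the names dat, vrb_nm, and freq_sketch are made up for illustration.

```r
# Hypothetical toy data: two items and one grouping variable
dat <- data.frame(
  item1 = c(1, 2, 2, 1),
  item2 = c(1, 3, 3, 3),
  grp   = c("a", "a", "b", "b")
)
vrb_nm <- c("item1", "item2")

# Within each group, tabulate every item over the union of values seen in
# that group, so a value absent from one item gets a count of 0
freq_sketch <- lapply(split(dat[vrb_nm], dat$grp), function(d) {
  lvls <- sort(unique(unlist(d)))
  t(sapply(d, function(v) table(factor(v, levels = lvls))))
})

freq_sketch$a  # rows: item1, item2; columns: "1", "2", "3"
```

In group "a", item1 never takes the value 3 and item2 never takes the value 2, so those cells are 0 rather than missing.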

The name for the table element giving the frequency of missing values is "(NA)". This is different from table, where the name is NA_character_. This change allows for the sorting of tables that include missing values, as an element named NA_character_ cannot be selected by name in R. In future versions of the package, this might change, as it should be possible to avoid this issue by subsetting with a logical vector or integer indices instead of names. However, it is convenient to be able to subset the return object fully by names.
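
The naming issue can be reproduced with base R. This is a small illustration on made-up data (the object names raw and renamed are arbitrary); it shows why an NA name cannot be used for subsetting and how a "(NA)" label avoids the problem.

```r
x <- c(1, 2, 2, NA)
tab <- table(x, useNA = "always")

# Plain named vector of counts; the names are "1", "2", and NA_character_
raw <- setNames(as.vector(tab), names(tab))

raw[NA_character_]        # NA: a missing name never matches any element
raw[is.na(names(raw))]    # logical indexing does retrieve the NA count

# Relabel the missing-value element so it can be selected by name
renamed <- raw
names(renamed)[is.na(names(renamed))] <- "(NA)"
renamed[c("(NA)", "1", "2")]  # reordering by name now works
```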

Value

list of data.frames containing the frequencies for the variables in data[vrb.nm] by group. The list has one element per group, where the groups are given by unique(interaction(data[grp.nm], sep = sep)). Depending on prop, the frequencies are either counts (FALSE) or proportions (TRUE) by group. Depending on total, the nrow for each data.frame is 1) length(vrb.nm) (if total = "no"), 2) 1 + length(vrb.nm) (if total = "yes"), or 3) 1 (if total = "only"). The rownames are vrb.nm for each variable in data[vrb.nm] and "_total_" for the total row (if present). The colnames for each data.frame are the unique values present in data[vrb.nm], potentially including "(NA)" depending on useNA.
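
How group labels arise from multiple grouping variables can be previewed with interaction directly. The data below are made up, and the order in which freqs_by arranges the list elements is not specified on this page, so the sketch only shows which labels to expect.

```r
# Hypothetical data with two grouping variables
dat <- data.frame(gender = c(1, 1, 2, 2), education = c(3, 5, 3, 3))
grp_nm <- c("gender", "education")

# Combine the grouping variables into one factor, joining values with sep
grp <- interaction(dat[grp_nm], sep = ".")

# Labels of the groups that actually occur in the data
grp_labels <- as.character(unique(grp))
grp_labels  # "1.3" "1.5" "2.3"
```

Changing sep (e.g., sep = "_") changes only the separator within each label, not the grouping itself.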

Examples

vrb_nm <- str2str::inbtw(names(psych::bfi), "A1","O5")
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender") # default
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   prop = TRUE) # proportions by row
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   useNA = "no") # without NA counts
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   total = "yes") # include total counts
freqs_by(data = psych::bfi, vrb.nm = vrb_nm,
   grp.nm = c("gender","education")) # multiple grouping variables

[Package quest version 0.2.0 Index]