freq_by {quest}    R Documentation

Univariate Frequency Table By Group

Description

freq_by creates a frequency table for a single atomic vector by group, returning one table per group. To tabulate a set of variables in a data.frame by group, see freqs_by.

Usage

freq_by(
  x,
  grp,
  exclude = if (useNA == "no") c(NA, NaN),
  useNA = "always",
  prop = FALSE,
  sort = "frequency",
  decreasing = TRUE,
  na.last = TRUE
)

Arguments

x

atomic vector.

grp

atomic vector or list of atomic vectors (e.g., data.frame) specifying the groups. The atomic vector(s) must be the same length as x; otherwise an error is returned.

exclude

unique values of x to exclude from the returned table. If NULL, then missing values are always included in the returned table. See table for documentation on the same argument.

useNA

character vector of length 1 specifying how to handle missing values (i.e., whether to include NA as an element in the returned table). There are three options: 1) "no" = don't include missing values in the table, 2) "ifany" = include missing values if there are any, 3) "always" = include missing values in the table, regardless of whether there are any or not. See table for documentation on the same argument.
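The three useNA options mirror those of base table, so a small base-R sketch may help show the difference. It does not call freq_by, and the vector x is made up for illustration.

```r
# Base-R illustration of the three useNA options using table() only.
x <- c("a", "b", "a", NA)

t_no     <- table(x, useNA = "no")      # NA is dropped from the table
t_ifany  <- table(x, useNA = "ifany")   # NA is counted because one is present
t_always <- table(c("a", "b"), useNA = "always")  # NA entry kept even with a zero count

length(t_no)      # two entries: "a" and "b"
length(t_ifany)   # three entries: "a", "b" and NA
as.vector(t_always)
```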

prop

logical vector of length 1 specifying whether the returned table should include counts (FALSE) or proportions (TRUE). If NAs are excluded (e.g., useNA = "no" or exclude = c(NA, NaN)), then the proportions will be based on the number of observed elements.
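The effect of excluding NAs on the denominator can be sketched in base R. This only illustrates the arithmetic described above, under the assumption that proportions are counts divided by their sum; it is not taken from the package source.

```r
# Proportions with and without missing values, using base table() only.
x <- c("a", "a", "b", NA)

counts_obs <- table(x, useNA = "no")
props_obs  <- counts_obs / sum(counts_obs)   # denominator is 3 observed elements

counts_all <- table(x, useNA = "always")
props_all  <- counts_all / sum(counts_all)   # denominator is all 4 elements

as.vector(props_obs)
as.vector(props_all)
```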

sort

character vector of length 1 specifying how the returned table will be sorted. There are three options: 1) "frequency" = the frequency of the unique values in x, 2) "position" = the position when each unique value first appears in x, 3) "alphanum" = alphanumeric ordering of the unique values in x (the sorting used by table). When "frequency" is specified and there are ties, then the ties are sorted alphanumerically.
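The three orderings can be reproduced roughly with base R. The sketch below shows what each ordering means for a made-up vector with no ties; it is not the package's own sorting code.

```r
# Three ways to order the unique values of x, using base R only.
x <- c("b", "a", "c", "b", "c", "b")
tab <- table(x)                                   # counts: a = 1, b = 3, c = 2

by_alphanum  <- names(tab)                        # ordering used by table()
by_frequency <- names(sort(tab, decreasing = TRUE))
by_position  <- unique(x)                         # order of first appearance in x
```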

decreasing

logical vector of length 1 specifying whether the table should be sorted in decreasing (TRUE) or increasing (FALSE) order.

na.last

logical vector of length 1 specifying whether the table should have NAs last (TRUE) or in whatever position the sorting places them (FALSE). This argument is only relevant if NAs exist in x and are included in the table (e.g., useNA = "always" or exclude = NULL).

Details

tables_by uses plyr::rbind.fill to combine the results from table applied to each variable into a single data.frame for each group. If a variable from data[vrb.nm] for each group does not have values present in other variables from data[vrb.nm] for that group, then the frequencies in the return object will be 0.

The name for the table element giving the frequency of missing values is "(NA)". This is different from table, where the name is NA_character_. This change allows for the sorting of tables that include missing values, as subsetting in R is not possible with NA_character_ names. In future versions of the package, this might change, as it should be possible to avoid this issue by subsetting with a logical vector or integer indices instead of names. However, it is convenient to be able to subset the return object fully by names.
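The naming issue can be seen with a base-R sketch. The renaming step below is an assumption about how a "(NA)" name could be produced, not the package's implementation.

```r
# Why an NA name is awkward: character subsetting never matches a missing name.
x <- c("a", NA, "a", "b")
tab <- table(x, useNA = "always")

raw <- setNames(as.vector(tab), names(tab))   # names are "a", "b", NA
raw[NA_character_]                            # NA: the missing name is not matched

freq <- raw
names(freq)[is.na(names(freq))] <- "(NA)"     # illustrative renaming
freq[["(NA)"]]                                # count of missing values, by name
```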

Value

list of numeric vectors of frequencies by group. The list has one element per group, as specified by unique(interaction(grp, sep = sep)). The frequencies are either counts (if prop = FALSE) or proportions (if prop = TRUE), with the unique values of x as names (missing values have name = "(NA)"). Note, this is different from table, which returns a 1D-array and has class "table".
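The shape of the return value can be approximated in base R: a list with one named numeric vector per group. This is a sketch only. The separator "." and drop = TRUE are assumptions, and freq_by itself may differ in sorting and in how missing values are handled.

```r
# Base-R approximation of a list of named frequency vectors by group.
x   <- mtcars$gear
grp <- list(vs = mtcars$vs, am = mtcars$am)

g <- interaction(grp, sep = ".", drop = TRUE)   # one level per observed group

out <- lapply(split(x, g), function(v) {
  tab <- table(v)
  setNames(as.vector(tab), names(tab))          # plain named vector, not class "table"
})

str(out)
```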

See Also

freq freq_by freqs_by table

Examples

x <- freq_by(mtcars$"gear", grp = mtcars$"vs")
str(x)
y <- freq_by(mtcars$"am", grp = mtcars$"vs", useNA = "no")
str(y)
str2str::lv2m(lapply(X = y, FUN = rev), along = 1) # ready to pass to prop.test()

[Package quest version 0.2.0 Index]