freqs_by {quest} | R Documentation |
Multiple Univariate Frequency Tables
Description
freqs_by creates a frequency table for a set of variables in a data.frame
by group. Depending on total, frequencies for all the variables together can
be returned by group. The function probably makes the most sense for sets of
variables with similar unique values (e.g., items from a questionnaire with
similar response options).
Usage
freqs_by(
data,
vrb.nm,
grp.nm,
prop = FALSE,
useNA = "always",
total = "no",
sep = "."
)
Arguments
data |
data.frame of data. |
vrb.nm |
character vector of colnames from |
grp.nm |
character vector of colnames from |
prop |
logical vector of length 1 specifying whether the frequencies
should be counts (FALSE) or proportions (TRUE). Note, whether the
proportions include missing values depends on the |
useNA |
character vector of length 1 specifying how missing values
should be handled. The three options are 1) "no" = do not include NA
frequencies in the return object, 2) "ifany" = only NA frequencies if there
are any missing values (in any variable from |
total |
character vector of length 1 specifying whether the frequencies
for the set of variables as a whole should be returned. The name "total"
refers to tabulating the frequencies for the variables from
|
sep |
character vector of length 1 specifying the string to combine the
group values together with. |
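The interaction between prop and useNA can be illustrated with base R alone. The sketch below uses table and prop.table on a toy vector (x is a hypothetical example, not data from the package) to show how including or excluding NA changes the denominator. Whether freqs_by computes its proportions in exactly this way is an assumption.

```r
# Toy vector with one missing value (hypothetical example data).
x <- c(1, 1, 2, NA)

# Proportions that ignore missing values: the denominator is 3.
p_no <- prop.table(table(x, useNA = "no"))

# Proportions that count missing values: the denominator is 4.
p_all <- prop.table(table(x, useNA = "always"))

p_no
p_all
```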
Details
freqs_by uses plyr::rbind.fill to combine the results from table applied to
each variable into a single data.frame for each group. If, within a group, a
variable from data[vrb.nm] does not have values that are present in other
variables from data[vrb.nm], then the frequencies for those values in the
return object will be 0.
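The zero-filling behavior can be sketched for a single group in base R. This is not the package's implementation: instead of plyr::rbind.fill it tabulates every variable against the union of observed values, which gives the same kind of result for this toy case. The data and the names dat, vrb_nm, and all_vals are hypothetical.

```r
# Toy data for one group: two items with partly different observed values.
dat <- data.frame(item1 = c(1, 2, 2, 3), item2 = c(2, 2, 4, 4))
vrb_nm <- c("item1", "item2")

# Union of all values observed across the variables.
all_vals <- sort(unique(unlist(dat[vrb_nm])))

# Tabulate each variable against the shared set of values, so a value that a
# variable never takes receives a count of 0 rather than being absent.
freq_mat <- t(sapply(dat[vrb_nm], function(x) table(factor(x, levels = all_vals))))

# One row per variable, one column per unique value.
freq_mat
```

Here item1 never takes the value 4 and item2 never takes the values 1 or 3, so those cells are 0.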
The name for the table element giving the frequency of missing values is
"(NA)". This is different from table, where the name is NA_character_. This
change allows for the sorting of tables that include missing values, as
subsetting in R is not possible with NA_character_ names. In future versions
of the package, this might change, as it should be possible to avoid this
issue by subsetting with a logical vector or integer indices instead of
names. However, it is convenient to be able to subset the return object fully
by names.
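The naming issue can be reproduced in base R. The sketch below uses a toy character vector (hypothetical data) to show that an element whose name is NA_character_ cannot be retrieved by name, and that renaming it to "(NA)" makes name-based subsetting and sorting work. It illustrates the motivation only and is not taken from the package source.

```r
# Toy vector with one missing value (hypothetical example data).
x <- c("a", "b", "b", NA)
tab <- table(x, useNA = "always")

# Plain named vector of counts; the last name is NA_character_.
counts <- setNames(as.vector(tab), names(tab))

# Subsetting by an NA name never matches, so this returns NA, not the count.
by_na_name <- counts[NA_character_]

# Reordering by sorted names silently loses the NA element,
# because sort() drops NA by default.
sorted_plain <- counts[sort(names(counts))]

# Renaming the element to "(NA)" makes it reachable by name.
renamed <- counts
names(renamed)[is.na(names(renamed))] <- "(NA)"
by_label <- renamed["(NA)"]
sorted_renamed <- renamed[sort(names(renamed))]

# The alternative mentioned above: a logical index works without renaming.
by_logical <- counts[is.na(names(counts))]
```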
Value
list of data.frames containing the frequencies for the variables in
data[vrb.nm] by group. The list has one element for each group specified by
unique(interaction(data[grp.nm], sep = sep)). Depending on prop, the
frequencies are either counts (FALSE) or proportions (TRUE) by group.
Depending on total, the nrow for each data.frame is either 1) length(vrb.nm)
(if total = "no"), 2) 1 + length(vrb.nm) (if total = "yes"), or 3) 1 (if
total = "only"). The rownames are vrb.nm for each variable in data[vrb.nm]
and "_total_" for the total row (if present). The colnames for each
data.frame are the unique values present in data[vrb.nm], potentially
including "(NA)" depending on useNA.
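The expression that defines the groups can be tried on its own. The sketch below builds the group labels for a toy data.frame (dat and grp_nm are hypothetical names) and shows that only combinations that actually occur in the data count as groups, which can be fewer than the number of factor levels that interaction creates.

```r
# Toy data: the combination gender = 2, education = 2 never occurs.
dat <- data.frame(gender    = c(1, 1, 2, 2, 2),
                  education = c(1, 2, 1, 1, 1))
grp_nm <- c("gender", "education")

# Group labels, with the grouping values joined by sep.
grp <- interaction(dat[grp_nm], sep = ".")

# Observed groups: "1.1", "1.2", and "2.1".
obs_groups <- as.character(unique(grp))

# interaction() still defines a level for every possible combination.
n_levels <- nlevels(grp)
```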
See Also
Examples
vrb_nm <- str2str::inbtw(names(psych::bfi), "A1","O5")
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender") # default
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   prop = TRUE) # proportions by row
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   useNA = "no") # without NA counts
freqs_by(data = psych::bfi, vrb.nm = vrb_nm, grp.nm = "gender",
   total = "yes") # include total counts
freqs_by(data = psych::bfi, vrb.nm = vrb_nm,
   grp.nm = c("gender","education")) # multiple grouping variables