ard_categorical {cards}    R Documentation

Categorical ARD Statistics

Description

Compute Analysis Results Data (ARD) for categorical summary statistics.

Usage

ard_categorical(data, ...)

## S3 method for class 'data.frame'
ard_categorical(
  data,
  variables,
  by = dplyr::group_vars(data),
  strata = NULL,
  statistic = everything() ~ c("n", "p", "N"),
  denominator = NULL,
  fmt_fn = NULL,
  stat_label = everything() ~ default_stat_labels(),
  ...
)

Arguments

data

(data.frame)
a data frame

...

Arguments passed to methods.

variables

(tidy-select)
columns to include in summaries. The method signature under Usage sets no default, so supply this argument explicitly.

by, strata

(tidy-select)
columns to tabulate by or to stratify by. The two arguments are similar, with one important distinction:

by: results are tabulated by all combinations of the columns specified, including unobserved combinations and unobserved factor levels.

strata: results are tabulated by all observed combinations of the columns specified.

Arguments may be used in conjunction with one another.
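
The distinction is easiest to see on a grouping factor with an unused level. The following is a minimal sketch, assuming the cards package is installed; the toy data frame df and its columns grp and x are hypothetical and not part of the package.

```r
library(cards)

# Hypothetical toy data: `grp` declares level "c", but no row uses it.
df <- data.frame(
  grp = factor(c("a", "a", "b"), levels = c("a", "b", "c")),
  x   = c("yes", "no", "yes")
)

# by: tabulates every level of `grp`, including the unobserved "c".
by_res <- ard_categorical(df, by = grp, variables = x)

# strata: tabulates only the combinations observed in the data.
strata_res <- ard_categorical(df, strata = grp, variables = x)

# Compare which group levels appear in each result.
unique(as.character(unlist(by_res$group1_level)))
unique(as.character(unlist(strata_res$group1_level)))
```

Per the description above, the by result should carry rows for the unused level while the strata result should not.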

statistic

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element (or the RHS of a formula) is one or more of c("n", "N", "p").

denominator

(data.frame, integer)
Specify this optional argument to change the denominator, i.e. the "N" statistic. Default is NULL. See the Denominators section below for details.

fmt_fn

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element is a named list of functions (or the RHS of a formula), e.g. list(mpg = list(mean = \(x) round(x, digits = 2) |> as.character())).

stat_label

(formula-list-selector)
a named list, a list of formulas, or a single formula where the list element is either a named list or a list of formulas defining the statistic labels, e.g. everything() ~ list(n = "n", p = "pct") or everything() ~ list(n ~ "n", p ~ "pct").
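
As an illustration of how fmt_fn and stat_label might be combined, the sketch below formats the proportion as a percentage string and relabels it. It assumes the cards and dplyr packages are installed and uses the ADSL data set from the Examples section; the formatting function itself is only an illustrative choice.

```r
library(cards)

res <- ard_categorical(
  ADSL,
  variables = "AGEGR1",
  # Illustrative formatter: show the proportion as a percentage with one decimal.
  fmt_fn = list(AGEGR1 = list(p = function(x) sprintf("%.1f%%", 100 * x))),
  # Relabel the statistics; the labels come from the stat_label description above.
  stat_label = dplyr::everything() ~ list(n = "n", p = "pct")
)

# Inspect the labels attached to each statistic.
unique(unlist(res$stat_label))
```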

Value

an ARD data frame of class 'card'

Denominators

By default, ard_categorical() returns the statistics "n", "N", and "p", where little "n" is the count for each variable level and big "N" is the number of non-missing observations. The default percentage is simply p = n/N.

However, it is sometimes necessary to provide a different "N" to use as the denominator in this calculation. For example, in a calculation of the rates of various observed adverse events, you may need to update the denominator to the number of enrolled subjects.
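
The arithmetic behind this can be sketched in base R, without cards. The adverse-event vector and the enrollment count below are made-up values used only to show how the choice of "N" changes "p".

```r
# Hypothetical data: 3 adverse-event records among 10 enrolled subjects.
ae_term    <- c("Headache", "Nausea", "Headache")
n_enrolled <- 10L

n          <- table(ae_term)          # little "n": count per level
N_default  <- sum(!is.na(ae_term))    # big "N": non-missing observations
p_default  <- n / N_default           # default percentage, p = n / N
p_enrolled <- n / n_enrolled          # same counts, enrolled subjects as "N"

p_default
p_enrolled
```

The counts are identical in both cases; only the denominator, and therefore the reported rate, differs.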

In such cases, use the denominator argument to specify a new definition of "N", and subsequently "p". As noted under Arguments, the argument accepts a data frame or an integer.
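
As one illustration, the sketch below passes a single integer as the denominator. It assumes the cards and dplyr packages are installed and that an integer value is used directly as the new "N"; the value 300L is arbitrary and is not a property of the ADSL data.

```r
library(cards)

# Assumption: a single integer passed to `denominator` replaces the default "N".
# 300L is an arbitrary illustrative value.
res <- ard_categorical(ADSL, variables = "AGEGR1", denominator = 300L)

# Inspect the "N" and "p" statistics computed with the supplied denominator.
dplyr::filter(res, stat_name %in% c("N", "p"))
```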

Other Statistics

In some cases, you may need other kinds of statistics for categorical variables. Despite the name, ard_continuous() can be used to obtain these statistics.

In the example below, we calculate the mode of a categorical variable.

get_mode <- function(x) {
  table(x) |> sort(decreasing = TRUE) |> names() |> getElement(1L)
}

ADSL |>
  ard_continuous(
    variables = AGEGR1,
    statistic = list(AGEGR1 = list(mode = get_mode))
  )
#> {cards} data frame: 1 x 8
#>   variable   context stat_name stat_label  stat fmt_fn
#> 1   AGEGR1 continuo…      mode       mode 65-80   <fn>
#> i 2 more variables: warning, error

Examples

ard_categorical(ADSL, by = "ARM", variables = "AGEGR1")

ADSL |>
  dplyr::group_by(ARM) |>
  ard_categorical(
    variables = "AGEGR1",
    statistic = everything() ~ "n"
  )

[Package cards version 0.2.0 Index]