tbl_summary {gtsummary} | R Documentation |
Summary table
Description
The tbl_summary() function calculates descriptive statistics for
continuous, categorical, and dichotomous variables.
Review the tbl_summary vignette for detailed examples.
Usage
tbl_summary(
  data,
  by = NULL,
  label = NULL,
  statistic = list(
    all_continuous() ~ "{median} ({p25}, {p75})",
    all_categorical() ~ "{n} ({p}%)"
  ),
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("ifany", "no", "always"),
  missing_text = "Unknown",
  missing_stat = "{N_miss}",
  sort = all_categorical(FALSE) ~ "alphanumeric",
  percent = c("column", "row", "cell"),
  include = everything()
)
Arguments
data |
( |
by |
( |
label |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
sort |
( |
percent |
( |
include |
( |
Value
A gtsummary table of class c('tbl_summary', 'gtsummary')
statistic argument
The statistic argument specifies the statistics presented in the table.
For example, statistic = list(age ~ "{mean} ({sd})") would report the mean and
standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})")
would report the mean and standard deviation for all continuous variables.
The values are interpreted using glue::glue() syntax:
a name that appears between curly brackets is interpreted as a function
name, and the formatted result of that function is placed in the table.
For categorical variables, the following statistics are available to display:
{n} (frequency), {N} (denominator), {p} (percent).
For continuous variables, any univariate function may be used.
The most commonly used functions are {median}, {mean}, {sd}, {min}, and {max}.
Additionally, {p##} is available for percentiles, where ## is an integer from 0 to 100.
For example, p25: quantile(probs=0.25, type=2).
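The percentile definition above can be checked directly in base R. The sketch below uses a made-up vector (an assumption, not data from the package) and compares the type = 2 quantile named above with R's default definition (type = 7), which can give a different value on small samples.

```r
# Illustrative data; not taken from the package documentation.
x <- c(2, 4, 6, 8, 10, 12, 14, 16)

# Quantile definition given above for the 25th percentile:
# inverse empirical CDF with averaging at discontinuities.
p25 <- quantile(x, probs = 0.25, type = 2)

# R's default definition (type = 7), shown only for comparison.
p25_default <- quantile(x, probs = 0.25)

print(c(type2 = unname(p25), type7 = unname(p25_default)))
```

If a summary table and a hand-computed quantile disagree, the quantile type is one plausible cause worth checking first.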
When the summary type is "continuous2", pass a vector of statistics.
Each element of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
- {N_obs} total number of observations
- {N_miss} number of missing observations
- {N_nonmiss} number of non-missing observations
- {p_miss} percentage of observations missing
- {p_nonmiss} percentage of observations not missing
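As a minimal sketch of how these placeholders combine, the call below mixes summary statistics with the missingness counts in a single glue string. It assumes the trial data set shipped with gtsummary; the particular wording of the statistic strings is only an illustration.

```r
library(gtsummary)

tbl <- trial |>
  tbl_summary(
    include = c(age, grade),
    statistic = list(
      # mean and SD, followed by how many observations were non-missing
      all_continuous() ~ "{mean} ({sd}); {N_nonmiss} of {N_obs} observed",
      # frequency, denominator, and percent for each level
      all_categorical() ~ "{n} / {N} ({p}%)"
    ),
    # the missing counts are already in the statistic, so drop the extra row
    missing = "no"
  )

tbl
```

Placing {N_nonmiss} inside the statistic string keeps the count on the same row as the summary, whereas the missing argument reports missing values on a separate row.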
digits argument
The digits argument specifies the number of digits (or formatting function) statistics are rounded to.
The values passed can either be a single integer, a vector of integers, a
function, or a list of functions. If a single integer or function is passed,
it is recycled to the length of the number of statistics presented.
For example, if the statistic is "{mean} ({sd})", it is equivalent to
pass 1, c(1, 1), label_style_number(digits=1), or
list(label_style_number(digits=1), label_style_number(digits=1)).
Named lists are also accepted to change the default formatting for a single
statistic, e.g. list(sd = label_style_number(digits=1)).
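The equivalence described above can be sketched by building the same table with each form of the digits value. The helper summarise_age() is a hypothetical name introduced only for this illustration; the trial data set is assumed to be the one shipped with gtsummary.

```r
library(gtsummary)

# Hypothetical helper (not part of gtsummary): summarise age with a given
# digits specification so the different forms can be compared side by side.
summarise_age <- function(digits_spec) {
  trial |>
    tbl_summary(
      include = age,
      statistic = list(age = "{mean} ({sd})"),
      digits = list(age = digits_spec),
      missing = "no"
    )
}

tbl_int  <- summarise_age(1)                                # single integer, recycled
tbl_vec  <- summarise_age(c(1, 1))                          # one integer per statistic
tbl_fun  <- summarise_age(label_style_number(digits = 1))   # single function, recycled
tbl_list <- summarise_age(
  list(label_style_number(digits = 1), label_style_number(digits = 1))
)

tbl_int
```

Under the default theme the four tables are expected to display the same formatted values; a formatting function becomes useful when something other than plain rounding is needed.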
type and value arguments
There are four summary types. Use the type argument to change the default summary types.
- "continuous" summaries are shown on a single row. Most numeric variables default to summary type continuous.
- "continuous2" summaries are shown on 2 or more rows.
- "categorical" multi-line summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use type = list(varname ~ "continuous").
- "dichotomous" categorical variables that are displayed on a single row, rather than one row per level of the variable. Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous, and the TRUE, 1, and yes rows are displayed. Otherwise, the value to display must be specified in the value argument, e.g. value = list(varname ~ "level to show").
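The sketch below shows one way the type and value arguments might be combined. It uses the base R mtcars data set as an assumption for illustration, because it contains numeric variables with only a few unique values (cyl, gear) that would otherwise default to categorical.

```r
library(gtsummary)

tbl <- mtcars |>
  tbl_summary(
    include = c(mpg, cyl, gear),
    type = list(
      mpg ~ "continuous2",   # summarised on two rows
      cyl ~ "continuous",    # numeric with few levels, forced to continuous
      gear ~ "dichotomous"   # single row for one level, chosen via value
    ),
    # continuous2 takes a vector: one table row per element
    statistic = list(mpg ~ c("{mean} ({sd})", "{min}, {max}")),
    # level of gear to display on the single dichotomous row
    value = list(gear ~ 4)
  )

tbl
```

Setting the type for gear explicitly alongside value keeps the intent visible, even in cases where supplying value alone might be enough.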
Author(s)
Daniel D. Sjoberg
See Also
See tbl_summary vignette for detailed tutorial
See table gallery for additional examples
Review list, formula, and selector syntax used throughout gtsummary
Examples
# Example 1 ----------------------------------
trial |>
  select(age, grade, response) |>
  tbl_summary()

# Example 2 ----------------------------------
trial |>
  select(age, grade, response, trt) |>
  tbl_summary(
    by = trt,
    label = list(age = "Patient Age"),
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    digits = list(age = c(0, 1))
  )

# Example 3 ----------------------------------
trial |>
  select(age, marker) |>
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min}, {max}"),
    missing = "no"
  )