bivariate_compare {describedata} | R Documentation |
Create publication-style table across one categorical variable
Description
Descriptive statistics for categorical variables as well as normally and non-normally distributed continuous variables, split across levels of a categorical variable. Depending on the variable type, an appropriate statistical test is used to assess differences across levels of the comparison variable.
Usage
bivariate_compare(df, compare, normal_vars = NULL,
non_normal_vars = NULL, cat_vars = NULL, display_round = 2,
p = TRUE, p_round = 4, include_na = FALSE, col_n = TRUE,
cont_n = FALSE, all_cont_mean = FALSE, all_cont_median = FALSE,
iqr = TRUE, fisher = FALSE, workspace = NULL, var_order = NULL,
var_label_df = NULL)
Arguments
df |
A data.frame or tibble. |
compare |
Discrete variable. Separate statistics will be produced for each level, with statistical tests across levels. Must be quoted. |
normal_vars |
Character vector of normally distributed continuous variables that will be included in the descriptive table. |
non_normal_vars |
Character vector of non-normally distributed continuous variables that will be included in the descriptive table. |
cat_vars |
Character vector of categorical variables that will be included in the descriptive table. |
display_round |
Number of decimal places displayed values should be rounded to. |
p |
Logical. Should p-values be calculated and displayed? Default TRUE. |
p_round |
Number of decimal places p-values should be rounded to. |
include_na |
Logical. Should NA values be included in the table? Default FALSE. |
col_n |
Logical. Should the total number of observations be displayed for each column? Default TRUE. |
cont_n |
Logical. Display sample n for continuous variables in the table. Default FALSE. |
all_cont_mean |
Logical. Display mean (SD) for all continuous variables. Default FALSE. |
all_cont_median |
Logical. Display median (IQR) for all continuous variables. Default FALSE. |
iqr |
Logical. If the median is displayed for a continuous variable, should the interquartile range be displayed as well (TRUE) or not (FALSE)? Default TRUE. |
fisher |
Logical. Should Fisher's exact test be used for categorical variables? Default FALSE. |
workspace |
Numeric value indicating the workspace to be used for Fisher's exact test. Default NULL. |
var_order |
Character vector listing the variable names in the order results should be displayed. Default NULL. |
var_label_df |
A data.frame or tibble with columns "variable" and "label" that contains display labels for each variable specified in normal_vars, non_normal_vars, and cat_vars. Default NULL. |
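The var_order and var_label_df arguments are easiest to understand from a small example. The following is a minimal sketch, assuming the iris data set; the label text and the object names var_labels and display_order are made up for illustration, and the call is guarded so that it only runs when the describedata package is installed.

```r
# Hypothetical label table: one row per variable, using the two column
# names ("variable" and "label") given in the var_label_df description.
var_labels <- data.frame(
  variable = c("Sepal.Length", "Sepal.Width"),
  label    = c("Sepal length (cm)", "Sepal width (cm)"),
  stringsAsFactors = FALSE
)

# Hypothetical display order: a character vector of variable names.
display_order <- c("Sepal.Width", "Sepal.Length")

# Run the comparison only if the package is available.
if (requireNamespace("describedata", quietly = TRUE)) {
  tab <- describedata::bivariate_compare(
    iris,
    compare      = "Species",
    normal_vars  = c("Sepal.Length", "Sepal.Width"),
    var_order    = display_order,
    var_label_df = var_labels
  )
  print(tab)
}
```

Keeping the labels in a separate data.frame means the same lookup table can be reused across several tables built from the same data.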
Details
Statistical differences between normally distributed continuous variables are assessed using aov(), differences in non-normally distributed variables are assessed using kruskal.test(), and differences in categorical variables are assessed using chisq.test() by default, with a user option for fisher.test() instead.
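To make the tests named above concrete, the sketch below runs each of them directly in base R on the iris data. It illustrates the four test functions only; how bivariate_compare calls them internally (arguments, formula construction, NA handling) is not shown in this page, and the long_petal indicator is a made-up categorical variable used purely for the example.

```r
# One-way ANOVA for a continuous variable treated as normally distributed.
fit_aov <- aov(Sepal.Length ~ Species, data = iris)
p_aov <- summary(fit_aov)[[1]][["Pr(>F)"]][1]

# Kruskal-Wallis rank sum test for a variable treated as non-normal.
p_kw <- kruskal.test(Sepal.Length ~ Species, data = iris)$p.value

# Hypothetical categorical variable: is the petal longer than 4 cm?
long_petal <- iris$Petal.Length > 4
tab <- table(long_petal, iris$Species)

# Chi-squared test of independence (the documented default for categorical variables).
p_chisq <- chisq.test(tab)$p.value

# Fisher's exact test as the alternative; workspace is the argument of
# fisher.test() that the workspace option above refers to (200000 is its default).
p_fisher <- fisher.test(tab, workspace = 200000)$p.value

print(c(aov = p_aov, kruskal = p_kw, chisq = p_chisq, fisher = p_fisher))
```

Fisher's exact test is usually preferred when expected cell counts are small, at the cost of more computation on larger tables.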
Value
A data.frame with columns label, overall, a column for each level of compare, and p.value. For normal_vars, mean (SD) is displayed; for non_normal_vars, median (IQR) is displayed; and for cat_vars, n (percent) is displayed. For p-values on continuous variables, a superscript 'a' denotes that the Kruskal-Wallis test was used.
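The three summary formats described above can be reproduced by hand in base R, which may help when checking a table cell. This is a rough sketch on the iris data rounded to two decimal places; the exact string layout that bivariate_compare produces (for example, how the IQR is written) is an assumption here and may differ.

```r
x <- iris$Sepal.Length

# Mean (SD), the summary described for normal_vars.
mean_sd <- sprintf("%.2f (%.2f)", mean(x), sd(x))

# Median (IQR), the summary described for non_normal_vars.
# IQR() returns the width of the interquartile range as a single number.
median_iqr <- sprintf("%.2f (%.2f)", median(x), IQR(x))

# n (percent), the summary described for cat_vars, one entry per level.
counts <- table(iris$Species)
n_pct <- sprintf("%d (%.2f)", as.integer(counts), 100 * as.numeric(prop.table(counts)))
names(n_pct) <- names(counts)

print(mean_sd)
print(median_iqr)
print(n_pct)
```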
Examples
bivariate_compare(iris, compare = "Species", normal_vars = c("Sepal.Length", "Sepal.Width"))
bivariate_compare(mtcars, compare = "cyl", non_normal_vars = "mpg")