R: Get covariate summary dataframe

covsum {reportRmd}

R Documentation

Get covariate summary dataframe

Description

Returns a dataframe corresponding to a descriptive table.

Usage

covsum(
  data,
  covs,
  maincov = NULL,
  digits = 1,
  numobs = NULL,
  markup = TRUE,
  sanitize = TRUE,
  nicenames = TRUE,
  IQR = FALSE,
  all.stats = FALSE,
  pvalue = TRUE,
  effSize = FALSE,
  show.tests = FALSE,
  dropLevels = TRUE,
  excludeLevels = NULL,
  full = TRUE,
  digits.cat = 0,
  testcont = c("rank-sum test", "ANOVA"),
  testcat = c("Chi-squared", "Fisher"),
  include_missing = FALSE,
  percentage = c("column", "row")
)

Arguments

`data`	dataframe containing data
`covs`	character vector with the names of columns to include in table
`maincov`	covariate to stratify table by
`digits`	number of digits for summarizing mean data; does not affect p-values
`numobs`	named list overriding the number of people you expect to have the covariate
`markup`	boolean indicating if you want LaTeX markup
`sanitize`	boolean indicating if you want to sanitize all strings to not break LaTeX
`nicenames`	boolean indicating if you want to replace . and _ in strings with a space
`IQR`	boolean indicating if you want to display the interquartile range (Q1,Q3) as opposed to (min,max) in the summary for continuous variables
`all.stats`	boolean indicating if all summary statistics (Q1,Q3 + min,max on a separate line) should be displayed. Overrides IQR.
`pvalue`	boolean indicating if you want p-values included in the table
`effSize`	boolean indicating if you want effect sizes included in the table. Can only be obtained if pvalue is also requested. Effect sizes calculated include Cramer's V for categorical variables, Cohen's d, Wilcoxon r, or Eta-squared for numeric/continuous variables.
`show.tests`	boolean indicating if the type of statistical test and effect size used should be shown in a column beside the p-values. Ignored if pvalue=FALSE.
`dropLevels`	logical, indicating if empty factor levels should be dropped from the output, default is TRUE.
`excludeLevels`	a named list of covariate levels to exclude from statistical tests, in the form list(varname = c('level1','level2')). These levels will be excluded from association tests, but not from the table. This can be useful for levels where there is a logical skip (i.e. not missing, but not presented). Ignored if pvalue=FALSE.
`full`	boolean indicating if you want the full sample included in the table, ignored if maincov is NULL
`digits.cat`	number of digits for the proportions when summarizing categorical data (default: 0)
`testcont`	test of choice for continuous variables, one of "rank-sum test" (default) or "ANOVA"
`testcat`	test of choice for categorical variables, one of "Chi-squared" (default) or "Fisher"
`include_missing`	option to include NA values of maincov. NAs will not be included in statistical tests.
`percentage`	choice of how percentages are presented, one of "column" (default) or "row"
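
As an illustration of how several of these arguments combine, the following is a minimal sketch on simulated data. The data frame trial and its columns (arm, age, stage) are made up for this example and are not part of reportRmd; the call is skipped if the package is not installed.

```r
# Simulated data for illustration only; all column names are hypothetical.
set.seed(42)
n <- 60
trial <- data.frame(
  arm   = factor(sample(c("Control", "Treatment"), n, replace = TRUE)),
  age   = round(rnorm(n, mean = 60, sd = 10)),
  stage = factor(sample(c("I", "II", "Not assessed"), n, replace = TRUE))
)

if (requireNamespace("reportRmd", quietly = TRUE)) {
  tab <- reportRmd::covsum(
    data = trial,
    covs = c("age", "stage"),       # columns to summarize
    maincov = "arm",                # stratify the table by treatment arm
    IQR = TRUE,                     # show (Q1,Q3) instead of (min,max)
    show.tests = TRUE,              # add a column naming the test used
    testcont = "ANOVA",             # parametric test for continuous covariates
    excludeLevels = list(stage = "Not assessed"),  # kept in table, dropped from test
    percentage = "row",             # row rather than column percentages
    markup = FALSE                  # plain text, no LaTeX markup
  )
  print(tab)
}
```

Here "Not assessed" is treated as a logical skip: it still appears in the table, but according to the excludeLevels description it is left out of the association test.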

Details

Comparisons for categorical variables default to chi-square tests. If there are counts of <5, the Fisher exact test is used instead; if this is unsuccessful, a second attempt is made, computing p-values by Monte Carlo (MC) simulation. If testcont='ANOVA', a t-test with unequal variance is used for two groups and an ANOVA for three or more. The statistical test used can be displayed by specifying show.tests=TRUE.
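
The test-selection rules described above can be sketched in base R as follows. This is not the package's internal code: cat_pvalue and cont_pvalue are hypothetical helpers, and the sketch assumes that "counts of <5" refers to observed cell counts and that the MC fallback corresponds to fisher.test with simulate.p.value = TRUE.

```r
# Hypothetical helper: choose a test for a categorical covariate x by group g.
cat_pvalue <- function(x, g) {
  tab <- table(x, g)
  if (all(tab >= 5)) {
    return(list(test = "Chi-squared", p = chisq.test(tab)$p.value))
  }
  # Small counts: try the exact test first, then fall back to simulation
  # (assumed fallback; the package may implement this differently).
  p <- tryCatch(
    fisher.test(tab)$p.value,
    error = function(e) {
      fisher.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
    }
  )
  list(test = "Fisher", p = p)
}

# Hypothetical helper: the testcont = "ANOVA" branch for a continuous covariate.
cont_pvalue <- function(y, g) {
  g <- droplevels(factor(g))
  if (nlevels(g) == 2) {
    # Two groups: t-test with unequal variance (Welch)
    list(test = "t-test", p = t.test(y ~ g, var.equal = FALSE)$p.value)
  } else {
    # Three or more groups: one-way ANOVA
    fit <- aov(y ~ g)
    list(test = "ANOVA", p = summary(fit)[[1]][["Pr(>F)"]][1])
  }
}

# Made-up data: one cell of the 2 x 2 table has a count of 3.
x  <- factor(c(rep("a", 12), rep("b", 3), rep("a", 4), rep("b", 11)))
g2 <- factor(rep(c("g1", "g2"), each = 15))
g3 <- factor(rep(c("a", "b", "c"), times = 10))
set.seed(1)
y  <- rnorm(30)

res_cat   <- cat_pvalue(x, g2)   # falls through to the Fisher branch
res_two   <- cont_pvalue(y, g2)  # two groups
res_three <- cont_pvalue(y, g3)  # three groups
```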

The number of decimal places used to display the statistics can be changed with digits, but this will not change the display of p-values. If more significant digits are required for p-values, use tableOnly=TRUE and format as desired.
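
Once unformatted p-values are in hand, they can be formatted to any precision with base R. The sketch below assumes the p-values are available as a numeric vector; the vector p and the helper fmt_p are hypothetical and are not part of reportRmd.

```r
# Hypothetical numeric p-values, standing in for values taken from a table.
p <- c(0.00004, 0.0312, 0.5)

# Hypothetical helper: fixed number of decimals, with a floor for tiny values.
fmt_p <- function(p, digits = 4) {
  floor_value <- 10^(-digits)
  ifelse(
    p < floor_value,
    paste0("<", formatC(floor_value, format = "f", digits = digits)),
    formatC(p, format = "f", digits = digits)
  )
}

formatted <- fmt_p(p)
print(formatted)
# "<0.0001" "0.0312" "0.5000"
```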

References

Ellis, P.D. (2010) The essential guide to effect sizes: statistical power, meta-analysis, and the interpretation of research results. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511761676