R: Generate a Tablet for Data Frame

tablet.data.frame {tablet}

R Documentation

Generate a Tablet for Data Frame

Description

Generates a 'tablet': a summary table of formatted statistics for factors (is.factor()) and numerics (is.numeric()) in x, both with and without grouping variables (if present; see group_by). Column names represent the finest level of grouping and are distinguished by the attribute 'nest' (the values of the other, higher-level groups, if any); the 'all' column holds the ungrouped statistics. The column attribute 'n' gives the corresponding observation count. Input should not have column names beginning with '_tablet'.

Usage

## S3 method for class 'data.frame'
tablet(
 x,
 ...,
 na.rm = FALSE,
 all = 'All',
 fun = list(
  sum ~ sum(x,  na.rm = TRUE),
  pct ~ signif(digits = 3,     sum / n * 100        ),
  ave ~ signif(digits = 3,    mean(x,  na.rm = TRUE)),
  std ~ signif(digits = 3,      sd(x,  na.rm = TRUE)),
  med ~ signif(digits = 3,  median(x,  na.rm = TRUE)),
  min ~ signif(digits = 3,     min(x,  na.rm = TRUE)),
  max ~ signif(digits = 3,     max(x,  na.rm = TRUE))
 ),
 fac = list(
  ` ` ~ sum + ' (' + pct + '%' + ')'
 ),
 num = list(
  `Mean (SD)` ~ ave + ' (' + std + ')',
  `Median (range)` ~ med + ' (' + min + ', ' + max + ')'
 ),
 lab = list(
  lab ~ name + '\n(N = ' + n + ')'
 ),
 na.rm_fac = na.rm,
 na.rm_num = na.rm,
 exclude_fac = NULL,
 exclude_name = NULL,
 all_levels = FALSE
)

Arguments

`x`	data.frame (possibly grouped)
`...`	substitute formulas for elements of fun, fac, num, lab
`na.rm`	whether to remove NA in general; supplies the defaults for na.rm_fac and na.rm_num
`all`	a column name for ungrouped statistics; can have length zero to suppress ungrouped column
`fun`	default aggregate functions expressed as formulas
`fac`	a list of formulas to generate widgets for factors
`num`	a list of formulas to generate widgets for numerics
`lab`	a list of formulas to generate label attributes for columns (see details)
`na.rm_fac`	whether to drop NA factor observations; passed to `gather` as na.rm, interacts with exclude_fac
`na.rm_num`	whether to drop NA numeric observations; passed to `gather` as na.rm
`exclude_fac`	which factor levels to exclude; see `factor` (exclude)
`exclude_name`	whether to drop NA values of column name (for completeness); passed to `gather`
`all_levels`	whether to supply records for unobserved levels

Details

Arguments 'fun', 'fac', 'num', and 'lab' are lists of two-sided formulas that are evaluated in an environment where '+' expresses concatenation (for character elements). The LHS values should be unique across all four lists. 'fun' is a list of aggregate statistics that have access to N (the number of original records), n (the number of group members), and x (the numeric observations, or 1 for each factor value). Aggregate statistics generated by 'fun' are available for use in 'fac' and 'num', which create visualizations thereof ('widgets').

Column-specific attributes are available to elements of 'lab', including the special attribute name (the current column name). For 'lab' only, if the RHS succeeds, it becomes the label attribute of the corresponding output column. 'lab' is used here principally to support annotation of *output* columns; if *input* columns have the attribute 'label' or 'title' (the latter has priority), those will already have been substituted for the default column names at the appropriate positions in the output.
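The evaluation scheme for the formula lists can be illustrated with a small base-R sketch. This is a hypothetical re-creation for explanation only, not the package's internal code: the names `concat_env`, `fun`, and `num`, and the values of `x` and `N`, are assumptions made up for the example.

```r
# Hypothetical re-creation of the formula evaluation; not the package's code.
x <- c(2.5, 3.1, 4.7, NA, 1.9)   # numeric observations for one group (made up)
n <- length(x)                    # number of group members
N <- 10                           # number of original records (made up)

# Environment holding x, n, and N, in which '+' concatenates whenever
# either operand is character and adds as usual otherwise.
concat_env <- list2env(list(x = x, n = n, N = N))
concat_env[['+']] <- function(e1, e2) {
  if (is.character(e1) || is.character(e2)) paste0(e1, e2) else base::`+`(e1, e2)
}

# Aggregate statistics (as in 'fun') and one widget (as in 'num').
fun <- list(
  ave ~ signif(digits = 3, mean(x, na.rm = TRUE)),
  std ~ signif(digits = 3,   sd(x, na.rm = TRUE))
)
num <- list(`Mean (SD)` ~ ave + ' (' + std + ')')

# Evaluate each RHS in turn and store the result under its LHS name,
# so that later formulas can refer to earlier results.
for (f in c(fun, num)) {
  assign(as.character(f[[2]]), eval(f[[3]], concat_env), envir = concat_env)
}

widget <- get('Mean (SD)', envir = concat_env)
print(widget)
```

Because every result is stored under its LHS name in one shared environment, a repeated LHS would overwrite an earlier value, which is one plausible reason the LHS values need to be unique across the lists.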

Missingness of observations (and to a lesser extent, levels of grouping variables) merits special consideration. Be aware that na.rm_fac and na.rm_num take their defaults from na.rm. Furthermore, na.rm_fac may interact with exclude_fac, which is passed to factor as exclude. The goal is to support all possible ways of expressing or ignoring missingness. That said, if aggregate functions are removing NA, the values of arguments beginning with 'na.rm' or 'exclude' may not matter.
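The two points above (the effect of exclude on factor levels, and aggregate functions that already remove NA) can be seen in base R alone. The vectors `f` and `v` below are made-up illustrations, and the sketch does not call tablet() itself.

```r
# Made-up factor-like data with one missing value.
f <- c('a', NA, 'b', 'a')

# By default, factor() excludes NA from the levels ...
default_levels <- levels(factor(f))
# ... whereas exclude = NULL retains NA as a level of its own.
kept_levels <- levels(factor(f, exclude = NULL))

print(default_levels)
print(kept_levels)

# Made-up numeric data with one missing value.
v <- c(1, NA, 3)

# An aggregate that removes NA itself gives the same answer whether or
# not the NA observations were dropped beforehand.
mean_with_na    <- mean(v, na.rm = TRUE)
mean_without_na <- mean(v[!is.na(v)], na.rm = TRUE)
print(c(mean_with_na, mean_without_na))
```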

Column 1 of output is character. Its values are typically the names of the original columns that were factor or numeric but not in groups(x). If the first of these had a label attribute or (with priority) a title attribute with class 'latex', then column 1 is assigned the class 'latex' as well. It therefore makes sense to be consistent across input columns regarding the presence or absence of a 'latex' label or title. By default, as_kable.tablet dispatches class-specific methods for escape_latex.

Similarly, row 1 of output is typically character. As of version 0.6.6, if any of the grouping variables inherits 'latex', then the return value of tablet.data.frame() has an attribute 'name_class' with value 'latex'.

Value

'tablet': a special case of data.frame with grouped rows and columns.

`*`	There is always one level of row groups.
`*`	There can be any number of column groups, including zero.
`*`	All columns are character (as tested by `is.character()`).
`*`	The first column has empty strings that represent the last non-empty value. It can be class 'latex' or 'character'.
`*`	Leading element(s) of first column are deliberately blank (one space character) and correspond to header rows. See `header_rows`.
`*`	The second column represents group-specific property names. It is populated always and only where column 1 is not.
`*`	All other columns represent group-specific property values; elements before the first non-empty group value represent nested headers.
`*`	Header values may be repeated.
`*`	Header values may be empty strings, representing the last non-empty value to the left, or single spaces, which are deliberately blank.
`*`	Internally, character NA is equivalent to an empty string.
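The conventions for empty strings, single spaces, and character NA listed above can be made concrete with a short base-R sketch. The vector `col1` is a made-up stand-in for a first column, not actual tablet() output, and the fill-down loop is only an illustration of how a reader might interpret such a column.

```r
# Made-up first column: a deliberately blank header cell (one space),
# then property names followed by empty strings and one character NA.
col1 <- c(' ', 'age', '', NA, 'sex', '', '')

# Character NA is treated as equivalent to an empty string.
col1[is.na(col1)] <- ''

# An empty string stands for the last non-empty value, so fill downward.
# The single space is not empty and is therefore left untouched.
filled <- col1
for (i in seq_along(filled)[-1]) {
  if (filled[i] == '') filled[i] <- filled[i - 1]
}

print(filled)
```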

Examples

library(boot)      # provides the melanoma data set
library(dplyr)
library(magrittr)
library(tablet)
melanoma %>%
  select(-time, -year) %>%
  mutate(sex = factor(sex), ulcer = factor(ulcer)) %>%  # summarize as factors
  group_by(status) %>%                                  # column groups by status
  tablet

[Package tablet version 0.6.8 Index]