sumtable {vtable} | R Documentation |
Summary Table Function
Description
This function outputs a summary statistics table, either to the console or as an HTML file that can be viewed continuously while working with data, or saved to file for use elsewhere. st() is the same function but takes fewer keystrokes to type.
Usage
sumtable(
data,
vars = NA,
out = NA,
file = NA,
summ = NA,
summ.names = NA,
add.median = FALSE,
group = NA,
group.long = FALSE,
group.test = FALSE,
group.weights = NA,
col.breaks = NA,
digits = 2,
fixed.digits = FALSE,
numformat = formatfunc(digits = digits, big.mark = ""),
skip.format = c("notNA(x)", "propNA(x)", "countNA(x)", obs.function),
factor.percent = TRUE,
factor.counts = TRUE,
factor.numeric = FALSE,
logical.numeric = FALSE,
logical.labels = c("No", "Yes"),
labels = NA,
title = "Summary Statistics",
note = NA,
anchor = NA,
col.width = NA,
col.align = NA,
align = NA,
note.align = "l",
fit.page = "\\textwidth",
simple.kable = FALSE,
obs.function = NA,
opts = list()
)
st(
data,
vars = NA,
out = NA,
file = NA,
summ = NA,
summ.names = NA,
add.median = FALSE,
group = NA,
group.long = FALSE,
group.test = FALSE,
group.weights = NA,
col.breaks = NA,
digits = 2,
fixed.digits = FALSE,
numformat = formatfunc(digits = digits, big.mark = ""),
skip.format = c("notNA(x)", "propNA(x)", "countNA(x)", obs.function),
factor.percent = TRUE,
factor.counts = TRUE,
factor.numeric = FALSE,
logical.numeric = FALSE,
logical.labels = c("No", "Yes"),
labels = NA,
title = "Summary Statistics",
note = NA,
anchor = NA,
col.width = NA,
col.align = NA,
align = NA,
note.align = "l",
fit.page = "\\textwidth",
simple.kable = FALSE,
obs.function = NA,
opts = list()
)
Arguments
data |
Data set; accepts any format with column names. |
vars |
Character vector of column names to include, in the order you'd like them included. Defaults to all numeric, factor, and logical variables, plus any character variables with six or fewer unique values. You can include strings that aren't columns in the data (including blanks) - these will create rows that are blank except for the string (left-aligned), for spacers or subtitles. |
out |
Determines where the completed table is sent. Set to |
file |
Saves the completed summary table file to file with this filepath. May be combined with any value of |
summ |
Character vector of summary statistics to include for numeric and logical variables, in the form |
summ.names |
Character vector of names for the summary statistics included. If |
add.median |
Adds |
group |
Character variable with the name of a column in the data set that statistics are to be calculated over. Value labels will be used if found for numeric variables. Changes the default |
group.long |
By default, if |
group.test |
Set to |
group.weights |
THIS OPTION DOES NOT AUTOMATICALLY WEIGHT ALL CALCULATIONS. This is mostly to be used with |
col.breaks |
Numeric vector indicating the variables (or number of elements of |
digits |
Number of digits after the decimal place to report. Set to a single number for consistent digits, or a vector the same length as |
fixed.digits |
Deprecated; currently only works if |
numformat |
A function that takes a numeric input and produces labeled output, which you might construct using the |
skip.format |
Set of functions in |
factor.percent |
Set to |
factor.counts |
Set to |
factor.numeric |
By default, factor variable dummies basically ignore the |
logical.numeric |
By default, logical variables are treated as factors with |
logical.labels |
When turning logicals into factors, use these labels for |
labels |
Variable labels. labels will accept four formats: (1) A vector of the same length as the number of variables in the data that will be included in the table (tricky to use if many are being dropped, also won't work for your |
title |
Character variable with the title of the table. |
note |
Table note to go after the last row of the table. Will follow significance star note if |
anchor |
Character variable to be used to set an anchor link in HTML tables, or a label tag in LaTeX. |
col.width |
Vector of page-width percentages, on 0-100 scale, overriding default column widths in an HTML table. Must have a number of elements equal to the number of columns in the resulting table. |
col.align |
For HTML output, a character vector indicating the HTML |
align |
For LaTeX output, string indicating the alignment of each column. Use standard LaTeX syntax (i.e. |
note.align |
For LaTeX output, set the alignment for the multi-column table note. Usually "l", but if you have a long note in LaTeX you might want to set it with "p" |
fit.page |
For LaTeX output, uses a resizebox to force the table to a certain width. Set to |
simple.kable |
For |
obs.function |
The function to use (and, potentially, format) to count the number of observations for the N column. This should take a vector and return a single number or string. Uses the same string formatting as |
opts |
The same |
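As a rough sketch of how a few of these arguments combine: the call below uses vars to choose and order the columns and to add a subtitle row, digits to set the number of decimal places, and title to set the caption. It assumes the vtable package is installed and that out = "return" hands the table back as a data frame, so nothing is opened in a viewer or browser. The object name tab is only illustrative.

```r
library(vtable)

# "Sepal Variables" is not a column of iris, so it becomes a subtitle row
tab <- sumtable(
  iris,
  vars = c("Sepal Variables", "Sepal.Length", "Sepal.Width"),
  digits = 3,
  title = "Sepal Summary",
  out = "return"   # assumption: returns the table as a data frame
)

print(tab)
```

Swapping out = "return" for another output option should leave the rest of the call unchanged.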
Details
There are many, many functions in R that will produce a summary statistics table for you. So why use sumtable()? sumtable() serves two main purposes:
(1) In the same spirit as vtable(), it makes it easy to view the summary statistics as you work, either in the Viewer pane or in a browser window.
(2) sumtable() is designed to have nice defaults and is not really intended for deep customization. It's got lots of options, sure, but they're only intended to go so far. So you can have a summary statistics table without much work.
In keeping with point (2), sumtable() is designed for people who want the kind of table that sumtable() produces, which is itself heavily influenced by the summary statistics tables you often see in economics papers. In that regard it is most similar to stargazer::stargazer(), except that it can handle tibbles, factor variables, and grouping, and can produce multicolumn tables; or to summarytools::dfSummary() or skimr::skim(), except that it is easier to export with nice formatting. If you want a lot of control over your summary statistics table, check out the packages gtsummary, arsenal, qwraps2, or Amisc, among many others.
If you would like to include a sumtable in an RMarkdown document, it should just work! If you leave out blank, it will default to a nicely formatted knitr::kable(), although this will drop some formatting elements like multi-column cells (or use out="kable" to get an unformatted kable that you can format yourself). If you prefer the vtable package formatting, then use out="latex" if outputting to LaTeX or out="htmlreturn" for HTML, both with results="asis" in the code chunk. Alternatively, in HTML, you can use the file option to write to file and use an <iframe> to include it.
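A minimal sketch of the vtable-formatted route, assuming the vtable and knitr packages are installed and that the code sits in a chunk with results="asis". The helper vtable_out() is hypothetical (it is not part of vtable); it uses knitr::is_latex_output() to pick between the two out values named above.

```r
library(vtable)
library(knitr)

# Hypothetical helper (not part of vtable): choose the out value
# that matches the format the document is being knitted to.
vtable_out <- function() {
  if (knitr::is_latex_output()) "latex" else "htmlreturn"
}

# Place in a chunk with results = "asis" so the emitted markup is
# passed through to the document rather than shown as console output.
st(iris, out = vtable_out())
```

Outside of knitting, is_latex_output() is FALSE, so the helper falls back to "htmlreturn".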
Examples
# Examples are only run interactively because they open HTML pages in Viewer or a browser.
if (interactive()) {
data(iris)
# Sumtable handles both numeric and factor variables
st(iris)
# Output to LaTeX as well for easy integration
# with RMarkdown, or \input{} into your LaTeX docs
# (specify file too to save the result)
st(iris, out = 'latex')
# Summary statistics by group
iris$SL.above.median <- iris$Sepal.Length > median(iris$Sepal.Length)
st(iris, group = 'SL.above.median')
# Add a group test, or report by-group in "long" format
st(iris, group = 'SL.above.median', group.test = TRUE)
st(iris, group = 'SL.above.median', group.long = TRUE)
# Going all out! Adding variable labels with labels,
# spacers and variable "category" titles with vars,
# Changing the presentation of the factor variable,
# and putting the factor in its own column with col.breaks
var.labs <- data.frame(var = c('SL.above.median', 'Sepal.Length',
                               'Sepal.Width', 'Petal.Length',
                               'Petal.Width'),
                       labels = c('Above-median Sepal Length', 'Sepal Length',
                                  'Sepal Width', 'Petal Length',
                                  'Petal Width'))
st(iris,
   labels = var.labs,
   vars = c('Sepal Variables', 'SL.above.median', 'Sepal.Length', 'Sepal.Width',
            'Petal Variables', 'Petal.Length', 'Petal.Width',
            'Species'),
   factor.percent = FALSE,
   col.breaks = 7)
# Format the results
# Replicate the data so there are enough observations to see the comma separators
irisrep <- do.call('rbind', replicate(100, iris, simplify = FALSE))
# Comma separator for thousands, including for N.
st(irisrep, numformat = 'comma')
# Dollar formatting for Sepal.Width, decimal (1.000,00) formatting for the rest
st(iris, numformat = c('decimal','Sepal.Width' = '$'))
# Custom formatting throughout, note the big.mark = ',' will also be picked up by N
st(irisrep, numformat = formatfunc(digits = 2, nsmall = 2, big.mark = ','))
}