tab_num {tabxplor}    R Documentation

Means table

Description

Cross categorical variables with numeric variables, and get a table of means and standard deviations.

Usage

tab_num(
  data,
  row_var,
  col_vars,
  tab_vars,
  wt,
  diff = "tot",
  ci = NULL,
  conf_level = 0.95,
  comp = c("tab", "all"),
  color = c("auto", "diff", "diff_ci", "after_ci"),
  digits = 0,
  na = c("keep", "drop", "drop_fct", "drop_num"),
  totaltab = "line",
  totaltab_name = "Ensemble",
  tot = NULL,
  total_names = "Total",
  subtext = "",
  num = FALSE,
  df = FALSE
)

Arguments

data

A data frame.

row_var

The row variable, which will be printed with one level per line. If numeric, it will be used as a factor.

col_vars

The numeric variables, which will appear in columns: means and standard deviations are calculated for each level of row_var and tab_vars.

tab_vars

<tidy-select> Tab variables: a subtable is made for each combination of levels of the selected variables. Leave empty to make a simple cross-table. All tab variables are converted to factors.

wt

A weight variable, of class numeric. Leave empty for unweighted results.
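As a rough illustration of what a weighted mean per level amounts to, here is a minimal base R sketch. It does not use tabxplor, and it is not meant to reproduce the package's internal computation, which this page does not describe; the data frame `toy` and its columns are hypothetical.

```r
# Hypothetical toy data: a grouping factor, a numeric variable, and weights.
toy <- data.frame(
  group  = factor(c("a", "a", "b", "b")),
  value  = c(10, 20, 30, 50),
  weight = c(1, 3, 1, 1)
)

# Weighted mean of `value` within each level of `group` (base R only).
weighted_means <- sapply(
  split(toy, toy$group),
  function(d) weighted.mean(d$value, d$weight)
)
weighted_means
```

With these weights, the second observation of level "a" counts three times as much as the first, which pulls the mean of that level toward 20.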

diff

The reference cell used to calculate differences (used to print colors):

  • "tot": by default, cell differences from total rows are calculated with pct = "row", and cell differences from total columns with pct = "col".

  • "first": calculate cell differences from the first cell of the row or column (useful to color temporal developments).

  • "no": do not calculate differences, to save computation time.

ci

The type of confidence intervals to calculate, passed to tab_ci (automatically added if needed for color).

  • "cell": absolute confidence intervals of cell percentages.

  • "diff": confidence intervals of the difference between a cell and the relative total cell (or relative first cell when diff = "first").

  • "auto": ci = "diff" for means and row/col percentages, ci = "cell" for frequencies ("all", "all_tabs").

By default, for percentages, Wilson's method is used with ci = "cell", and Wald's method with Agresti and Caffo's adjustment is used with ci = "diff". Means use the classic method. This can be changed in tab_ci.
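This page does not spell out what the "classic method" for means is. As an assumption for illustration only, the sketch below computes one common textbook interval, the Student t interval (mean plus or minus a t quantile times the standard error). The helper `mean_ci` is hypothetical and is not part of tabxplor; the package may use a different formula, for example a normal quantile.

```r
# Student t confidence interval for a mean (an assumed textbook form,
# not necessarily the formula used by tabxplor).
mean_ci <- function(x, conf_level = 0.95) {
  x <- x[!is.na(x)]                  # ignore missing values
  n <- length(x)                     # number of complete observations
  se <- sd(x) / sqrt(n)              # standard error of the mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(mean  = mean(x),
    lower = mean(x) - t_crit * se,
    upper = mean(x) + t_crit * se)
}

ci <- mean_ci(c(12, 15, 9, 14, NA, 10))
ci
```

Raising conf_level widens the interval, since a larger quantile multiplies the same standard error.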

conf_level

The confidence level for the confidence intervals, as a single numeric between 0 and 1. Defaults to 0.95 (95%).

comp

Comparison level. When tab_vars are present, should the contributions to variance be calculated for each subtable/group (comp = "tab", the default) or for the whole table (comp = "all")? comp must be set once and for all the first time you use tab_plain, tab_num or tab_chi2 with rows, or tab_ci.

color

The type of colors to print, as a single string: one of "auto", "diff", "diff_ci" or "after_ci". Colors for percentages and means are based on cell differences from totals or from the reference cell, as set by diff.

digits

The number of digits to print, as a single integer.

na

The policy to adopt for missing values in row and tab variables (factors), as a single string.

  • "keep": by default, NAs of row and tab variables are printed as an explicit "NA" level.

  • "drop": remove NAs in row and tab variables.

NAs in numeric variables are always removed when calculating means. For that reason the n field of each resulting fmt column, used to calculate confidence intervals, only takes into account the complete observations (without NA). To drop all rows with NA in any numeric variable first, use tab_prepare or tab_many with the na_drop_all argument.
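The effect described above can be pictured with a small base R sketch: the mean of each level ignores missing values, and the count that goes with it only covers complete observations. This is an illustration on hypothetical data, not tabxplor's own code.

```r
# Hypothetical toy data with missing values in the numeric variable.
toy <- data.frame(
  group = factor(c("a", "a", "a", "b", "b")),
  value = c(1, NA, 3, 4, NA)
)

# Mean per level with NAs removed.
means <- tapply(toy$value, toy$group, mean, na.rm = TRUE)

# Number of complete (non-missing) observations per level.
n_complete <- tapply(!is.na(toy$value), toy$group, sum)

means
n_complete
```

Level "a" has three rows but only two complete observations, so any interval built on that count is wider than the raw row count would suggest.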

totaltab

The total table, if there are subtables/groups (i.e. when tab_vars is provided):

  • "line": by default, add a general total line (necessary for calculations with comp = "all").

  • "table": add a complete total table (i.e. row_var by col_vars without tab_vars).

  • "no": do not draw any total table.

totaltab_name

The name of the total table, as a single string.

tot

The totals:

  • c("col", "row") or "both": by default, both total rows and total columns.

  • "row": only total rows.

  • "col": only total columns.

  • "no": remove all totals (after calculations if needed).

total_names

The names of the totals, as a character vector of length one or two. Use syntax of type c("Total row", "Total column") to set different names for rows and cols.

subtext

A character vector, printed as legend rows under the table.

num

Set to TRUE to obtain a table with normal numeric vectors (not fmt).

df

Set to TRUE to obtain a plain data.frame (not a tibble), with normal numeric vectors (not fmt). Useful, for example, to pass the table to correspondence analysis with FactoMineR.

Value

A tibble of class tabxplor_tab. If tab_vars are provided, a tibble of class tabxplor_grouped_tab. All non-text columns are fmt vectors of class tabxplor_fmt, storing all the data necessary to print formats and colors. Columns with row_var and tab_vars are of class factor: every added factor will be considered as a tab_vars and used for grouping. To add text columns without using them in calculations, be sure they are of class character.

Examples


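# Prepare the variables, dropping rows where wind is NA.
data <- dplyr::storms %>% tab_prepare(category, wind, na_drop_all = wind)
# Mean wind by storm category, with total rows only and color = "after_ci".
tab_num(data, category, wind, tot = "row", color = "after_ci")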


[Package tabxplor version 1.1.3 Index]