data_summary {datawizard}    R Documentation

Summarize data

Description

Computes summary statistics for a data frame or a matrix, optionally within groups.

Usage

data_summary(x, ...)

## S3 method for class 'data.frame'
data_summary(x, ..., by = NULL, include_na = TRUE)

Arguments

x

A (grouped) data frame.

...

One or more named expressions, each giving the name of a new variable and the function call that computes its summary statistic, for example mean_sepal_width = mean(Sepal.Width). An expression can also be provided as a character string, e.g. "mean_sepal_width = mean(Sepal.Width)". The summary function n() can be used to count the number of observations.

by

Optional character vector, indicating the name(s) of one or more variables in x. If supplied, the data will be split by these variables and summary statistics will be computed for each group.

include_na

Logical. If TRUE, missing values are included as a level in the grouping variable. If FALSE, missing values are omitted from the grouping variable.
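
The effect of by and include_na described above can be illustrated with a small base R sketch. This is not datawizard's implementation: the helper summarise_by_sketch() and the toy data frame d are hypothetical and exist only to show how missing values in a grouping variable either form their own group or are dropped.

```r
# Hypothetical helper, for illustration only; not part of datawizard.
# Computes the group-wise mean and group size of one column.
summarise_by_sketch <- function(x, value, by, include_na = TRUE) {
  g <- x[[by]]
  # Treat NA as its own group level when requested.
  # Without this step, split() silently drops rows whose group is NA.
  if (include_na && anyNA(g)) {
    g <- addNA(g)
  }
  pieces <- split(x[[value]], g)
  data.frame(
    group = names(pieces),
    mean = unname(vapply(pieces, mean, numeric(1))),
    n = unname(vapply(pieces, length, integer(1)))
  )
}

# Toy data: one observation has a missing group value.
d <- data.frame(g = c("a", "a", NA, "b"), v = c(1, 3, 10, 4))

with_na <- summarise_by_sketch(d, "v", "g", include_na = TRUE)
without_na <- summarise_by_sketch(d, "v", "g", include_na = FALSE)
```

In this sketch, with_na contains an extra row for the NA group, while without_na covers only the groups "a" and "b".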

Value

A data frame with the requested summary statistics.
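
To make the link between the named expressions in ... and the columns of the returned data frame concrete, here is a minimal base R sketch. It is an assumption about how such an interface could work, not datawizard's actual code; the helper summarise_sketch() is hypothetical. Each expression, whether given directly or as a character string, is evaluated with the columns of the data frame in scope and becomes one column of the result.

```r
# Hypothetical helper, for illustration only; not part of datawizard.
summarise_sketch <- function(data, ...) {
  # Capture the expressions in ... without evaluating them.
  exprs <- eval(substitute(alist(...)))
  out <- list()
  for (i in seq_along(exprs)) {
    e <- exprs[[i]]
    nm <- names(exprs)[i]
    if (is.character(e)) {
      # String form "name = expression": parse it into an assignment call,
      # then take the left-hand side as name and the right-hand side as expression.
      parsed <- parse(text = e)[[1]]
      nm <- as.character(parsed[[2]])
      e <- parsed[[3]]
    }
    # Evaluate the expression with the data frame's columns in scope.
    out[[nm]] <- eval(e, envir = data, enclos = parent.frame())
  }
  as.data.frame(out)
}

a <- summarise_sketch(iris, MW = mean(Sepal.Width), SD = sd(Sepal.Width))
b <- summarise_sketch(iris, "MW = mean(Sepal.Width)", "SD = sd(Sepal.Width)")
```

Both calls in this sketch yield a one-row data frame with the columns MW and SD.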

Examples

data(iris)
data_summary(iris, MW = mean(Sepal.Width), SD = sd(Sepal.Width))
data_summary(
  iris,
  MW = mean(Sepal.Width),
  SD = sd(Sepal.Width),
  by = "Species"
)

# same as
d <- data_group(iris, "Species")
data_summary(d, MW = mean(Sepal.Width), SD = sd(Sepal.Width))

# multiple groups
data(mtcars)
data_summary(mtcars, MW = mean(mpg), SD = sd(mpg), by = c("am", "gear"))

# expressions can also be supplied as character strings
data_summary(mtcars, "MW = mean(mpg)", "SD = sd(mpg)", by = c("am", "gear"))

# count observations within groups
data_summary(mtcars, observations = n(), by = c("am", "gear"))

# first and last observations of "mpg" within groups
data_summary(
  mtcars,
  first = mpg[1],
  last = mpg[length(mpg)],
  by = c("am", "gear")
)

[Package datawizard version 0.10.0 Index]