| summaryStats {s20x} | R Documentation |
Summary Statistics
Description
Produces a table of summary statistics for the data. If the argument
group is missing, calculates a matrix of summary statistics for the
data in x. If group is present, the elements of group
are interpreted as group labels and the summary statistics are displayed for
each group separately.
Usage
summaryStats(x, ...)
## Default S3 method:
summaryStats(
  x,
  group = rep("Data", length(x)),
  data.order = TRUE,
  digits = 2,
  ...
)
## S3 method for class 'formula'
summaryStats(x, data = NULL, data.order = TRUE, digits = 2, ...)
## S3 method for class 'matrix'
summaryStats(x, data.order = TRUE, digits = 2, ...)
Arguments
x |
either a single vector of values, or a formula of the form data~group, or a matrix. |
... |
Optional arguments which are passed to the summary statistic functions.
For example |
group |
a vector of group labels. |
data.order |
if |
digits |
the number of decimal places to display. |
data |
an optional data frame containing the variables in the model. |
Value
If x is a single variable, i.e. there are no groups, then a
single list is invisibly returned with the following named items:
min |
Minimum value. |
max |
Maximum value. |
mean |
Mean value. |
var |
Variance – the average of the squares of the deviations of the data values from the sample mean. |
sd |
Standard deviation – the square root of the variance. |
n |
Number of data values – size of the data set. |
nMissing |
If there are missing values, and |
iqr |
Midspread (IQR) – the range spanned by the central half of the data; the interquartile range. |
skewness |
Skewness statistic – indicates how skewed the data set is. Positive values indicate right-skewed data. Negative values indicate left-skewed data. |
lq |
Lower quartile |
median |
Median – the middle value when the batch is ordered. |
uq |
Upper quartile |
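To make the named items above concrete, here is a minimal base R sketch that recomputes them for a single vector. It is not the package's implementation: the helper name sketch_stats is hypothetical, the skewness formula (third central moment divided by the cubed standard deviation) and the default quantile type are assumptions, and so is the choice to drop missing values before counting n.

```r
# Hypothetical helper, not part of s20x: recomputes the listed statistics
# with base R so that each named item has a concrete definition.
sketch_stats <- function(x) {
  n_missing <- sum(is.na(x))
  x <- x[!is.na(x)]  # assumption: missing values are dropped before summarising
  q <- unname(quantile(x, c(0.25, 0.5, 0.75)))  # base R default quantile type
  list(
    min = min(x),
    max = max(x),
    mean = mean(x),
    var = var(x),    # sample variance (divisor n - 1)
    sd = sd(x),
    n = length(x),
    nMissing = n_missing,
    iqr = q[3] - q[1],
    # Assumed moment-based skewness; the package's formula may differ.
    skewness = mean((x - mean(x))^3) / sd(x)^3,
    lq = q[1],
    median = q[2],
    uq = q[3]
  )
}

# Made-up data with one missing value.
s <- sketch_stats(c(2, 4, 4, 5, 7, 9, NA))
s$mean
s$nMissing
```

Returning a named list means individual statistics can be pulled out with `$`, in the same way as the list items described above.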
If grouping is provided, either by using the
group argument, by providing a factor in a formula, or by passing a
matrix whose columns represent the groups, then the function
returns a data.frame with one row per group, each row containing all
the statistics above.
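The shape of the grouped return value can be pictured with a small base R sketch. This is an illustration under stated assumptions, not the package's code: the data and the names values, labels and by_group are hypothetical, and only four of the statistics are computed to keep it short.

```r
# Hypothetical data, used only for illustration.
values <- c(10, 12, 14, 20, 22, 30)
labels <- c("A", "A", "A", "B", "B", "C")

# Build one single-row data frame per group, then stack them.
# A reduced set of columns keeps the sketch short.
by_group <- do.call(rbind, lapply(split(values, labels), function(v) {
  data.frame(min = min(v), max = max(v), mean = mean(v), n = length(v))
}))

by_group          # rows are named after the group labels
by_group["B", ]   # all statistics for one group
by_group$mean     # one statistic across all groups
```

Row indexing by group label and column extraction with `$` follow the same pattern as the `sumStats['BA', ]` and `sumStats$mean` calls in the Examples.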
Methods (by class)
- summaryStats(default): Summary Statistics
- summaryStats(formula): Summary Statistics
- summaryStats(matrix): Summary Statistics
Examples
## STATS20x data:
data(course.df)
## Single variable summary
with(course.df, summaryStats(Exam))
## Using a formula
summaryStats(Exam ~ Stage1, course.df)
## Using a matrix
X = cbind(rnorm(50), rnorm(50))
summaryStats(X)
## Saving and extracting the information
sumStats = summaryStats(Exam ~ Degree, course.df)
sumStats
## Just the BAs
sumStats['BA', ]
## Just the means
sumStats$mean