summaryStats {s20x} | R Documentation |
Summary Statistics
Description
Produces a table of summary statistics for the data. If the argument group is missing, calculates a matrix of summary statistics for the data in x. If group is present, the elements of group are interpreted as group labels and the summary statistics are displayed for each group separately.
Usage
summaryStats(x, ...)
## Default S3 method:
summaryStats(
x,
group = rep("Data", length(x)),
data.order = TRUE,
digits = 2,
...
)
## S3 method for class 'formula'
summaryStats(x, data = NULL, data.order = TRUE, digits = 2, ...)
## S3 method for class 'matrix'
summaryStats(x, data.order = TRUE, digits = 2, ...)
Arguments
x |
either a single vector of values, or a formula of the form data~group, or a matrix. |
... |
Optional arguments which are passed to the summary statistic functions.
For example |
group |
a vector of group labels. |
data.order |
if |
digits |
the number of decimal places to display. |
data |
an optional data frame containing the variables in the model. |
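
As an illustration of the group and digits arguments described above, the following is a minimal sketch of a call to the default method. The variable names scores and section and the simulated data are made up for the example, and the code assumes the s20x package is installed.

```r
# Assumes the s20x package is installed; the data here are simulated.
if (requireNamespace("s20x", quietly = TRUE)) {
  set.seed(1)
  scores  <- c(rnorm(10, mean = 60, sd = 10), rnorm(10, mean = 70, sd = 10))
  section <- rep(c("A", "B"), each = 10)   # one group label per value in scores

  # Default method with explicit group labels, displayed to 3 decimal places
  res <- s20x::summaryStats(scores, group = section, digits = 3)
}
```

Because the result is returned invisibly, it is assigned to res here so that it can be inspected afterwards.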
Value
If x is a single variable, i.e. there are no groups, then a single list is invisibly returned with the following named items:
min |
Minimum value. |
max |
Maximum value. |
mean |
Mean value. |
var |
Variance – the average of the squares of the deviations of the data values from the sample mean. |
sd |
Standard deviation – the square root of the variance. |
n |
Number of data values – size of the data set. |
nMissing |
If there are missing values, and |
iqr |
Midspread (IQR) – the range spanned by the central half of the data; the interquartile range. |
skewness |
Skewness statistic – indicates how skewed the data set is. Positive values indicate right-skewed data. Negative values indicate left-skewed data. |
lq |
Lower quartile |
median |
Median – the middle value when the batch is ordered. |
uq |
Upper quartile |
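
To make the definitions above concrete, here is a rough base R sketch that computes the same kinds of quantities for a single vector. It is not the s20x implementation: the helper name basicStats is hypothetical, the quartiles use the default method of quantile(), the variance uses the n - 1 divisor of var(), and the skewness formula is an assumed moment-based estimator that may differ from the one the package uses. The nMissing item is left out because the sketch assumes complete data.

```r
# Hypothetical helper (not part of s20x): named list of summary
# statistics for a numeric vector with no missing values.
basicStats <- function(x) {
  q   <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)  # default quantile type
  dev <- x - mean(x)                                     # deviations from the mean
  list(
    min      = min(x),
    max      = max(x),
    mean     = mean(x),
    var      = var(x),             # sample variance, divisor n - 1
    sd       = sd(x),              # square root of var(x)
    n        = length(x),
    iqr      = q[3] - q[1],        # upper quartile minus lower quartile
    # Assumed moment-based skewness; the package may use another estimator
    skewness = mean(dev^3) / mean(dev^2)^1.5,
    lq       = q[1],
    median   = q[2],
    uq       = q[3]
  )
}

st <- basicStats(c(2, 4, 4, 5, 7, 9, 30))
st$median    # middle value of the ordered data
st$skewness  # positive here: the value 30 gives a long right tail
```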
If grouping is provided, either by using the group argument, or by providing a factor in a formula, or by passing a matrix whose columns represent the groups, then the function will return a data.frame with one row per group, each row containing all the statistics above.
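
The shape of the grouped result, one row per group and one column per statistic, can be mimicked in base R. The sketch below only illustrates that layout and is not how s20x builds its result; the data and the names marks, degree, and tab are made up, and only a few of the statistics are computed.

```r
# Made-up data: six marks, each labelled with a degree group
marks  <- c(55, 62, 71, 48, 80, 66)
degree <- c("BA", "BSc", "BA", "BSc", "BA", "BSc")

# Compute a few statistics within each group ...
perGroup <- lapply(split(marks, degree), function(v) {
  data.frame(min = min(v), max = max(v), mean = mean(v), sd = sd(v), n = length(v))
})

# ... then stack them so that each group becomes one named row
tab <- do.call(rbind, perGroup)

tab["BA", ]   # all statistics for one group, selected by row name
tab$mean      # one statistic across all groups
```

Row-name and column indexing of this kind is what the Examples section uses on the object returned by summaryStats.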
Methods (by class)
- summaryStats(default): Summary Statistics
- summaryStats(formula): Summary Statistics
- summaryStats(matrix): Summary Statistics
Examples
## STATS20x data:
data(course.df)
## Single variable summary
with(course.df, summaryStats(Exam))
## Using a formula
summaryStats(Exam ~ Stage1, course.df)
## Using a matrix
X = cbind(rnorm(50), rnorm(50))
summaryStats(X)
## Saving and extracting the information
sumStats = summaryStats(Exam ~ Degree, course.df)
sumStats
## Just the BAs
sumStats['BA', ]
## Just the means
sumStats$mean