Measures of Shape {DescTools} | R Documentation
Skewness and Kurtosis
Description
Skew computes the skewness, Kurt the excess kurtosis of the values in x.
Usage
Skew(x, weights = NULL, na.rm = FALSE, method = 3, conf.level = NA,
ci.type = "bca", R = 1000, ...)
Kurt(x, weights = NULL, na.rm = FALSE, method = 3, conf.level = NA,
ci.type = "bca", R = 1000, ...)
Arguments
x |
a numeric vector. An object which is not a vector is coerced to one, if possible. |
weights |
a numeric vector of weights, the same length as x. |
na.rm |
logical, indicating whether missing values should be removed before the computation proceeds. Defaults to FALSE. |
method |
integer out of 1, 2 or 3 (default). See Details. |
conf.level |
confidence level of the interval. If set to NA (the default), no confidence interval is calculated and only the estimate is returned. |
ci.type |
The type of confidence interval required. The value should be any subset of the supported types, which include "classic" and "bca" (the default); see Details. |
R |
The number of bootstrap replicates. Usually this will be a single positive integer. For importance resampling, some resamples may use one set of weights and others a different set of weights. |
... |
further arguments (the dots), passed on to the function that performs the bootstrap when confidence intervals are calculated. |
Details
Kurt() returns the excess kurtosis. The (non-excess) kurtosis can therefore be obtained as Kurt(x) + 3 if required.
If na.rm is TRUE then missing values are removed before computation proceeds.
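The skewness can be calculated by one of three methods, where n is the sample size, m_r = sum((x - mean(x))^r) / n is the r-th sample central moment and s is the sample standard deviation (with denominator n - 1):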
method = 1: g_1 = m_3 / m_2^(3/2)
method = 2: G_1 = g_1 * sqrt(n(n-1)) / (n-2)
method = 3: b_1 = m_3 / s^3 = g_1 ((n-1)/n)^(3/2)
For the kurtosis, the corresponding methods are:
method = 1: g_2 = m_4 / m_2^2 - 3
method = 2: G_2 = ((n+1) g_2 + 6) * (n-1) / ((n-2)(n-3))
method = 3: b_2 = m_4 / s^4 - 3 = (g_2 + 3) (1 - 1/n)^2 - 3
method = 1 is the typical definition used in Stata and in many older textbooks.
method = 2 is used in SAS and SPSS.
method = 3 is used in MINITAB and BMDP.
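As a rough illustration of the formulas above, the following base R sketch computes all three variants directly from their definitions. The helper names shape_moment, skew_by_method and kurt_by_method are hypothetical and not part of DescTools. The package's own implementation is coded in C and also handles weights and missing values, which this sketch does not.

```r
# Hypothetical helper (not part of DescTools): r-th sample central moment m_r
shape_moment <- function(x, r) mean((x - mean(x))^r)

# Skewness by method 1 (g_1), 2 (G_1) or 3 (b_1), following the formulas above
skew_by_method <- function(x, method = 3) {
  n  <- length(x)
  g1 <- shape_moment(x, 3) / shape_moment(x, 2)^(3/2)
  switch(as.character(method),
    "1" = g1,
    "2" = g1 * sqrt(n * (n - 1)) / (n - 2),
    "3" = g1 * ((n - 1) / n)^(3/2),
    stop("method must be 1, 2 or 3"))
}

# Excess kurtosis by method 1 (g_2), 2 (G_2) or 3 (b_2)
kurt_by_method <- function(x, method = 3) {
  n  <- length(x)
  g2 <- shape_moment(x, 4) / shape_moment(x, 2)^2 - 3
  switch(as.character(method),
    "1" = g2,
    "2" = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3)),
    "3" = (g2 + 3) * (1 - 1 / n)^2 - 3,
    stop("method must be 1, 2 or 3"))
}

# Toy data, chosen only for illustration
x <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
sapply(1:3, function(m) skew_by_method(x, m))
sapply(1:3, function(m) kurt_by_method(x, m))
```

For small samples the three variants can differ noticeably; the differences shrink as n grows, since the correction factors all tend to 1.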
Cramer (1997) mentions the asymptotic standard errors of the skewness and the kurtosis:
ASE.skew = sqrt( 6n(n-1)/((n-2)(n+1)(n+3)) )
ASE.kurt = 2 * ASE.skew * sqrt( (n^2 - 1)/((n-3)(n+5)) )
to be used for calculating the confidence intervals. This is implemented here with ci.type="classic". However, Joanes and Gill (1998) advise against this approach, pointing out that the normality assumption would virtually always be violated.
They suggest using the bootstrap method instead, which is why the default confidence interval type is set to "bca".
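To make the two approaches concrete, here is a minimal base R sketch for the skewness. It builds a normal-approximation interval from ASE.skew and, for comparison, a plain percentile bootstrap interval. The percentile bootstrap is a simpler stand-in for the "bca" type and is not what the package computes. The helper names skew_G1, classic_ci and percentile_ci are hypothetical, and pairing the standard error with the method = 2 estimate is an assumption of this sketch.

```r
# Hypothetical helper: method = 2 skewness (G_1), written out in base R
skew_G1 <- function(x) {
  n  <- length(x)
  m2 <- mean((x - mean(x))^2)
  m3 <- mean((x - mean(x))^3)
  (m3 / m2^(3/2)) * sqrt(n * (n - 1)) / (n - 2)
}

# Normal-approximation interval based on ASE.skew
classic_ci <- function(x, conf.level = 0.95) {
  n   <- length(x)
  ase <- sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
  est <- skew_G1(x)
  z   <- qnorm(1 - (1 - conf.level) / 2)
  c(skew = est, lwr.ci = est - z * ase, upr.ci = est + z * ase)
}

# Plain percentile bootstrap (simpler than BCa, for illustration only)
percentile_ci <- function(x, conf.level = 0.95, R = 1000) {
  boots <- replicate(R, skew_G1(sample(x, replace = TRUE)))
  alpha <- 1 - conf.level
  qs <- quantile(boots, c(alpha / 2, 1 - alpha / 2), names = FALSE)
  c(skew = skew_G1(x), lwr.ci = qs[1], upr.ci = qs[2])
}

set.seed(1)
x <- rexp(200)   # right-skewed toy data
classic_ci(x)
percentile_ci(x)
```

The normal-approximation interval is always symmetric around the estimate, whereas the bootstrap interval follows the resampled distribution, which is the motivation for preferring a bootstrap type when the data are clearly non-normal.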
This implementation of the two functions is comparatively fast, as the expensive sums are coded in C.
Value
If conf.level is set to NA, the result is a single numeric value. If a conf.level is provided, the result is a named numeric vector with 3 elements:
skew, kurt |
the specific estimate, either skewness or kurtosis |
lwr.ci |
lower bound of the confidence interval |
upr.ci |
upper bound of the confidence interval |
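The two return shapes described above can be checked with a short sketch along the following lines. It assumes the DescTools package is installed and uses simulated toy data; ci.type = "classic" is chosen only to avoid the bootstrap, so the call stays fast.

```r
# Runs only if DescTools is available
if (requireNamespace("DescTools", quietly = TRUE)) {
  set.seed(42)
  x <- rexp(100)   # toy data, for illustration only

  # Without conf.level: a single numeric value
  est <- DescTools::Skew(x)

  # With conf.level: the estimate plus lower and upper confidence bounds
  ci <- DescTools::Skew(x, conf.level = 0.95, ci.type = "classic")

  print(est)
  print(ci)
}
```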
Author(s)
Andri Signorell <andri@signorell.net>, David Meyer <david.meyer@r-project.org> (method = 3)
References
Cramer, D. (1997): Basic Statistics for Social Research. Routledge.
Joanes, D. N., Gill, C. A. (1998): Comparing measures of sample skewness and kurtosis. The Statistician, 47, 183-189.
See Also
mean, sd, similar code in library(e1071)
Examples
Skew(d.pizza$price, na.rm=TRUE)
Kurt(d.pizza$price, na.rm=TRUE)
# use sapply to calculate skewness for a data.frame
sapply(d.pizza[,c("temperature","price","delivery_min")], Skew, na.rm=TRUE)
# or apply to do that columnwise with a matrix
apply(as.matrix(d.pizza[,c("temperature","price","delivery_min")]), 2, Skew, na.rm=TRUE)