e_mcv {GFDmcv} | R Documentation |
Estimators and confidence intervals of four multivariate coefficients of variation and their reciprocals
Description
Calculates the estimators and the corresponding (1-\alpha)
-confidence intervals for the four variants of the multivariate coefficient of variation (MCV) and their reciprocals
by Reyment (1960), Van Valen (1974), Voinov and Nikulin (1996) and Albert and Zhang (2010).
Usage
e_mcv(x, conf_level = 0.95)
Arguments
x |
a matrix of data of size n\times d (n observations in rows, d variables in columns). |
conf_level |
a confidence level. By default, it is equal to 0.95. |
Details
The function e_mcv()
calculates four different variants of the multivariate coefficient of variation for d
-dimensional data. These variants were introduced
by Reyment (1960, RR), Van Valen (1974, VV), Voinov and Nikulin (1996, VN) and Albert and Zhang (2010, AZ):
{\widehat C}^{RR}=\sqrt{\frac{(\det\mathbf{\widehat\Sigma})^{1/d}}{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{VV}=\sqrt{\frac{\mathrm{tr}\mathbf{\widehat\Sigma}}{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{VN}=\sqrt{\frac{1}{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}^{-1}\boldsymbol{\widehat\mu}}},\
{\widehat C}^{AZ}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}\boldsymbol{\widehat\mu}}{(\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu})^2}},
where n
is the sample size, \boldsymbol{\widehat\mu}
is the empirical mean vector and \mathbf{\widehat \Sigma}
is the empirical covariance matrix:
\boldsymbol{\widehat\mu} = \frac{1}{n}\sum_{j=1}^{n} \mathbf{X}_{j},\; \mathbf{\widehat \Sigma} =\frac{1}{n}\sum_{j=1}^{n} (\mathbf{X}_{j} - \boldsymbol{\widehat \mu})(\mathbf{X}_{j} - \boldsymbol{\widehat \mu})^{\top}.
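The four definitions above can be re-computed in base R as a sanity check. This is a minimal sketch, not the package's implementation: mcv_by_hand() is a hypothetical helper introduced here for illustration, and it assumes observations are stored in the rows of the matrix.

```r
# Illustrative re-computation of the four MCV estimators from their definitions.
# mcv_by_hand() is a hypothetical helper and is NOT part of GFDmcv.
mcv_by_hand <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)                             # sample size
  d <- ncol(x)                             # dimension
  mu <- colMeans(x)                        # empirical mean vector
  sigma <- crossprod(sweep(x, 2, mu)) / n  # empirical covariance, divisor n (not n - 1)
  mu2 <- sum(mu^2)                         # mu' mu
  c(RR = sqrt(det(sigma)^(1 / d) / mu2),
    VV = sqrt(sum(diag(sigma)) / mu2),
    VN = sqrt(1 / drop(t(mu) %*% solve(sigma, mu))),
    AZ = sqrt(drop(t(mu) %*% sigma %*% mu) / mu2^2))
}

x <- as.matrix(iris[iris$Species == "setosa", 1:3])
C_hat <- mcv_by_hand(x)
C_hat
```

Note that cov() in base R divides by n - 1, so the covariance is built explicitly here to match the divisor n used in the formula.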
In the univariate case (d=1
), all four variants reduce to the univariate coefficient of variation. Furthermore, their reciprocals, the so-called standardized means, are determined:
{\widehat B}^{RR}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}{(\det\mathbf{\widehat\Sigma})^{1/d}}},\
{\widehat B}^{VV}=\sqrt{\frac{\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu}}{\mathrm{tr}\mathbf{\widehat\Sigma}}},\
{\widehat B}^{VN}=\sqrt{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}^{-1}\boldsymbol{\widehat\mu}},\
{\widehat B}^{AZ}=\sqrt{\frac{(\boldsymbol{\widehat\mu}^{\top}\boldsymbol{\widehat\mu})^2}{\boldsymbol{\widehat\mu}^{\top}\mathbf{\widehat\Sigma}\boldsymbol{\widehat\mu}}}.
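The univariate reduction and the reciprocal relation can be checked numerically. The sketch below plugs d = 1 into each formula by hand, using a single iris column as example data; it does not call the package.

```r
# Univariate check: with d = 1 every variant collapses to sd / |mean|.
v <- iris[iris$Species == "setosa", 1]
mu <- mean(v)
s2 <- mean((v - mu)^2)                   # variance with divisor n, as in the definition

C_RR <- sqrt(s2^(1 / 1) / mu^2)          # det of a 1 x 1 matrix is its single entry
C_VV <- sqrt(s2 / mu^2)                  # trace of a 1 x 1 matrix is its single entry
C_VN <- sqrt(1 / (mu * (1 / s2) * mu))
C_AZ <- sqrt((mu * s2 * mu) / (mu^2)^2)
C_all <- c(RR = C_RR, VV = C_VV, VN = C_VN, AZ = C_AZ)

B_all <- 1 / C_all                       # standardized means are the reciprocals

all.equal(unname(C_all), rep(sqrt(s2) / abs(mu), 4))
```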
In addition to the estimators, the respective confidence intervals [C_lwr
, C_upr
] for a given confidence level 1-\alpha
are calculated by the e_mcv()
function.
These confidence intervals are based on an asymptotic normal approximation; see Ditzhaus and Smaga (2023) for the technical details. The approximation
does not rely on any specific (semi-)parametric assumption on the distribution and remains valid nonparametrically, even for tied data.
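To illustrate how a normal approximation translates into interval bounds, the sketch below assumes the usual symmetric form estimate ± z * se. Whether e_mcv() uses exactly this form, and how it estimates the standard error, is covered in Ditzhaus and Smaga (2023) and is not reproduced here; wald_ci() and the numeric inputs are hypothetical.

```r
# Generic normal-approximation interval; wald_ci() is a hypothetical helper,
# not a GFDmcv function, and `se` must be supplied by the caller.
wald_ci <- function(est, se, conf_level = 0.95) {
  alpha <- 1 - conf_level
  z <- qnorm(1 - alpha / 2)   # standard normal quantile
  c(lwr = est - z * se, upr = est + z * se)
}

# Made-up estimate and standard error, for illustration only.
ci <- wald_ci(est = 0.1, se = 0.01, conf_level = 0.95)
ci
```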
Value
When d>1
(respectively d=1
) a data frame with four rows (one row) corresponding to the four MCVs (the univariate CV)
and six columns containing the estimators C_est
for the MCV (CV) and the estimators B_est
for their reciprocals as well as the lower and upper bounds of the corresponding
confidence intervals [C_lwr
, C_upr
] and [B_lwr
, B_upr
].
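A short sketch of inspecting the returned data frame follows. It assumes the column labels listed above are the literal column names, which is why the selection goes through intersect(); the snippet does nothing if GFDmcv is not installed.

```r
# Guarded so the snippet is a no-op when GFDmcv is not installed.
if (requireNamespace("GFDmcv", quietly = TRUE)) {
  x <- as.matrix(iris[iris$Species == "setosa", 1:3])
  res <- GFDmcv::e_mcv(x, conf_level = 0.90)
  print(dim(res))   # the documentation describes four rows and six columns for d > 1
  # Assumption: columns are literally named C_est, C_lwr, C_upr.
  cols <- intersect(c("C_est", "C_lwr", "C_upr"), names(res))
  print(res[, cols, drop = FALSE])
}
```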
References
Albert A., Zhang L. (2010) A novel definition of the multivariate coefficient of variation. Biometrical Journal 52:667-675.
Ditzhaus M., Smaga L. (2023) Inference for all variants of the multivariate coefficient of variation in factorial designs. Preprint https://arxiv.org/abs/2301.12009.
Reyment R.A. (1960) Studies on Nigerian Upper Cretaceous and Lower Tertiary Ostracoda: part 1. Senonian and Maastrichtian Ostracoda, Stockholm Contributions in Geology, vol 7.
Van Valen L. (1974) Multivariate structural statistics in natural history. Journal of Theoretical Biology 45:235-247.
Voinov V., Nikulin M. (1996) Unbiased Estimators and Their Applications, Vol. 2, Multivariate Case. Kluwer, Dordrecht.
Examples
# d > 1 (MCVs)
data_set <- lapply(list(iris[iris$Species == "setosa", 1:3],
                        iris[iris$Species == "versicolor", 1:3],
                        iris[iris$Species == "virginica", 1:3]),
                   as.matrix)
lapply(data_set, e_mcv)
# d = 1 (CV)
data_set <- lapply(list(iris[iris$Species == "setosa", 1],
                        iris[iris$Species == "versicolor", 1],
                        iris[iris$Species == "virginica", 1]),
                   as.matrix)
lapply(data_set, e_mcv)