gmd {GiniDistance} | R Documentation |
Gini Mean Difference
Description
Computes the Gini mean difference of x. The argument alpha is an exponent applied to the Euclidean distance; its default value is 1.
Usage
gmd(x, alpha)
Arguments
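x |
data: a numeric vector (univariate) or a matrix (multivariate), without missing values |
alpha |
exponent on the Euclidean distance, in (0,2); defaults to 1 |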
Details
gmd
computes the Gini mean difference of data.
It is a self-contained R function that handles both univariate and multivariate data.
The samples must not contain missing values. If alpha
is missing, it defaults to 1; otherwise it is used as the exponent on the Euclidean distance.
Gini mean difference (GMD) was originally introduced as an alternative measure of variability to the usual standard deviation (Gini, 1914; Yitzhaki and Schechtman, 2013). Let X and X^\prime be independent random variables from a univariate distribution F with finite first moment in \mathbf{R}. The GMD of F is
\Delta=\Delta(X)=\Delta(F)=E|X-X^{\prime}|,
the expected distance between two independent random variables. If sample data \mathbf x=\{x_1,x_2,...,x_n\} are available, the sample Gini mean difference is calculated by
\hat{\Delta} = {n \choose 2}^{-1} \sum_{1\leq i<j\leq n} | x_i - x_j| = {n \choose 2}^{-1} \sum_{i=1}^n (2i-n-1) x_{(i)},
where x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)} are the order statistics of \mathbf x (Schechtman and Yitzhaki, 1987). The second form follows because x_{(i)} enters i-1 of the pairwise differences with a plus sign and n-i of them with a minus sign. The computational complexity of the univariate Gini mean difference is therefore O(n \log n), the cost of sorting.
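The two expressions for the sample Gini mean difference can be compared with a short base R sketch. The function names gmd_pairwise and gmd_sorted are hypothetical helpers introduced here for illustration; they are not the implementation used by gmd.

```r
# Hypothetical helpers for illustration; not the GiniDistance source code.

# Direct O(n^2) evaluation: average |x_i - x_j| over all pairs i < j.
gmd_pairwise <- function(x) {
  n <- length(x)
  d <- abs(outer(x, x, "-"))          # n x n matrix of absolute differences
  sum(d[upper.tri(d)]) / choose(n, 2)
}

# O(n log n) evaluation through the order statistics x_(1) <= ... <= x_(n).
gmd_sorted <- function(x) {
  n <- length(x)
  xs <- sort(x)
  sum((2 * seq_len(n) - n - 1) * xs) / choose(n, 2)
}

x <- c(1, 2, 4, 7)
gmd_pairwise(x)
gmd_sorted(x)
```

For x = (1, 2, 4, 7) the six pairwise absolute differences sum to 20, and the weighted sum of order statistics is -3(1) - 1(2) + 1(4) + 3(7) = 20, so both functions return 20/6.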
Gini mean difference has been generalized to multivariate distributions (Koshevoy and Mosler, 1997). That is, the Gini mean difference of a distribution F in \mathbf{R}^d is
\Delta = E \|\mathbf X - \mathbf X^\prime\|,
or, more generally, for some \alpha \in (0,2),
\Delta(\alpha) = E \|\mathbf X-\mathbf X^\prime\|^{\alpha},
where \| \mathbf x \| is the Euclidean norm. The sample Gini mean difference is computed by
\hat{\Delta}(\alpha) = {n \choose 2}^{-1} \sum_{1\leq i<j\leq n} \| \mathbf x_i - \mathbf x_j\|^{\alpha}.
Its computational complexity is O(n^2).
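The multivariate sample formula can be sketched in base R with stats::dist, which returns the Euclidean distances between all pairs of rows. The helper name gmd_alpha and its input checks are assumptions made for illustration; this is not how gmd itself is implemented.

```r
# Hypothetical helper for illustration; not the GiniDistance implementation.
# Rows of x are observations; a plain vector is treated as n points in R^1.
gmd_alpha <- function(x, alpha = 1) {
  stopifnot(!anyNA(x), alpha > 0, alpha < 2)
  d <- as.vector(dist(as.matrix(x)))  # Euclidean distances for all pairs i < j
  mean(d^alpha)                       # same as dividing the sum by choose(n, 2)
}

x <- rbind(c(0, 0), c(3, 4), c(6, 8))
gmd_alpha(x, alpha = 1)
gmd_alpha(x, alpha = 0.5)
```

In this example the three pairwise distances are 5, 10 and 5, so the call with alpha = 1 returns 20/3. Because every pair is visited once, the cost grows as O(n^2).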
Value
gmd
returns the sample Gini mean difference.
References
Gini, C. (1914). Sulla misura della concentrazione e della variabilità dei caratteri. Atti del Reale Istituto Veneto di Scienze, Lettere ed Arti, 62, 1203-1248. English translation: On the measurement of concentration and variability of characters (2005). Metron, LXIII(1), 3-38.
Koshevoy, G. and Mosler, K. (1997). Multivariate Gini indices. Journal of Multivariate Analysis, 60, 252-276.
Schechtman, E. and Yitzhaki, S. (1987). A measure of association based on Gini's mean difference. Communications in Statistics - Theory and Methods, 16(1), 207-231.
Yitzhaki, S. and Schechtman, E. (2013). The Gini Methodology, Springer, New York.
See Also
Examples
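library(GiniDistance)

n <- 100
x <- runif(n)

# Univariate data with alpha = 1, timed with proc.time()
t0 <- proc.time()
gmd(x, alpha = 1)
proc.time() - t0

# Univariate data with alpha = 0.5
t1 <- proc.time()
gmd(x, alpha = 0.5)
proc.time() - t1

# Bivariate data: n/2 observations in 2 dimensions
x <- matrix(runif(n), n/2, 2)
gmd(x, alpha = 1)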