dCor {GiniDistance} | R Documentation |
Distance Covariance and Correlation Statistics
Description
Computes distance covariance and distance correlation statistics for quantitative data X and categorical labels Y, and returns these measures of dependence.
Usage
dCor(x, y, alpha)
Arguments
x | data |
y | label of data or univariate response variable |
alpha | exponent on Euclidean distance, in (0,2] |
Details
The sample size (number of rows) of the data must agree with the length of the label vector, and samples must not contain missing values. Arguments x and y are treated as data and labels, respectively.
dCor calls the dcor
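function from the energy package, which computes the distance correlation between X and Y when both are numerical variables. If Y is categorical, the set difference metric on the support of Y is used. That is,

$$d(y, y') = I(y \neq y'),$$

where $I(\cdot)$ is the indicator function. Then the sample distance correlation between data and labels is computed as follows.

Let $\tilde{A} = (\tilde{A}_{ij})$ be a symmetric, $n \times n$, centered distance matrix of sample $x_1, \dots, x_n$. The $(i,j)$-th entry of $\tilde{A}$ is

$$\tilde{A}_{ij} = a_{ij} - \frac{1}{n-2}\, a_{i\cdot} - \frac{1}{n-2}\, a_{\cdot j} + \frac{1}{(n-1)(n-2)}\, a_{\cdot\cdot}$$

if $i \neq j$ and 0 if $i = j$, where $a_{ij} = \|x_i - x_j\|^{\alpha}$, $a_{i\cdot} = \sum_{j=1}^{n} a_{ij}$, $a_{\cdot j} = \sum_{i=1}^{n} a_{ij}$, and $a_{\cdot\cdot} = \sum_{i,j=1}^{n} a_{ij}$. Similarly, using the set difference metric, a symmetric, $n \times n$, centered distance matrix is calculated for samples $y_1, \dots, y_n$ and denoted by $\tilde{B} = (\tilde{B}_{ij})$. Unbiased estimators of the distance covariance of X and Y, the distance variance of X, and the distance variance of Y are given respectively as

$$\widehat{\mathrm{dCov}}(X,Y) = \frac{1}{n(n-3)} \sum_{i \neq j} \tilde{A}_{ij}\tilde{B}_{ij}, \quad \widehat{\mathrm{dCov}}(X,X) = \frac{1}{n(n-3)} \sum_{i \neq j} \tilde{A}_{ij}^{2} \quad \text{and} \quad \widehat{\mathrm{dCov}}(Y,Y) = \frac{1}{n(n-3)} \sum_{i \neq j} \tilde{B}_{ij}^{2}.$$

Then the distance correlation is

$$\widehat{\mathrm{dCor}}(X,Y) = \frac{\widehat{\mathrm{dCov}}(X,Y)}{\sqrt{\widehat{\mathrm{dCov}}(X,X)\,\widehat{\mathrm{dCov}}(Y,Y)}}.$$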
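The computation described in this section can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper names u_center, u_product and dcor_sketch are hypothetical, the centering shown is the standard U-centering (zero diagonal, divisors n - 2 and (n - 1)(n - 2)), and the result is not guaranteed to match the output of dCor exactly.

```r
# Hypothetical helper: U-center a symmetric matrix of pairwise distances.
u_center <- function(d) {
  n <- nrow(d)
  row_sums <- rowSums(d)
  col_sums <- colSums(d)
  total <- sum(d)
  centered <- d -
    outer(row_sums, rep(1, n)) / (n - 2) -   # subtract scaled row sums
    outer(rep(1, n), col_sums) / (n - 2) +   # subtract scaled column sums
    total / ((n - 1) * (n - 2))              # add back scaled grand sum
  diag(centered) <- 0                        # entries with i == j are set to 0
  centered
}

# Hypothetical helper: inner product over i != j, divided by n(n - 3).
u_product <- function(a, b) {
  n <- nrow(a)
  sum(a * b) / (n * (n - 3))  # diagonals are zero, so only i != j contributes
}

# Sketch of the sample statistics for numeric data x and categorical labels y.
dcor_sketch <- function(x, y, alpha = 1) {
  y <- as.vector(y)                     # drop factor-level attributes
  a <- as.matrix(dist(x))^alpha         # Euclidean distances raised to alpha
  b <- outer(y, y, FUN = "!=") * 1      # set difference metric: 1 if labels differ
  a_tilde <- u_center(a)
  b_tilde <- u_center(b)
  dcov_xy <- u_product(a_tilde, b_tilde)
  dvar_x <- u_product(a_tilde, a_tilde)
  dvar_y <- u_product(b_tilde, b_tilde)
  c(dVarX = dvar_x, dVarY = dvar_y, dCov = dcov_xy,
    dCor = dcov_xy / sqrt(dvar_x * dvar_y))
}

res <- dcor_sketch(iris[, 1:4], unclass(iris[, 5]), alpha = 1)
print(res)
```

Because the diagonal of each centered matrix is zero, summing over all entries is the same as summing over pairs with i different from j, which is why u_product can use a plain elementwise product.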
Value
dCor returns the sample distance variance of x, the distance variance of y, the distance covariance of x and y, and the distance correlation of x and y.
References
Lyons, R. (2013). Distance covariance in metric spaces. The Annals of Probability, 41 (5), 3284-3305.
Szekely, G. J., Rizzo, M. L. and Bakirov, N. (2007). Measuring and testing dependence by correlation of distances. Annals of Statistics, 35 (6), 2769-2794.
Rizzo, M. L. and Szekely, G. J. (2017). energy: E-Statistics: Multivariate Inference via the Energy of Data. R package version 1.7-0.
See Also
Examples
library(GiniDistance)
x <- iris[, 1:4]           # four numeric measurements
y <- unclass(iris[, 5])    # species labels as integer codes
dCor(x, y, alpha = 1)
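The Details section requires that the number of rows of x match the length of y, that neither contain missing values, and that alpha lie in (0,2]. A small pre-check along these lines may help catch problems before calling dCor. The function check_dcor_inputs is a hypothetical helper written for this illustration and is not part of GiniDistance.

```r
# Hypothetical helper (not part of GiniDistance): validate inputs for dCor().
check_dcor_inputs <- function(x, y, alpha) {
  x <- as.matrix(x)
  if (nrow(x) != length(y)) {
    stop("number of rows of x must equal the length of y")
  }
  if (anyNA(x) || anyNA(y)) {
    stop("x and y must not contain missing values")
  }
  if (!is.numeric(alpha) || length(alpha) != 1 || alpha <= 0 || alpha > 2) {
    stop("alpha must be a single number in (0, 2]")
  }
  invisible(TRUE)
}

x <- iris[, 1:4]
y <- unclass(iris[, 5])
ok <- check_dcor_inputs(x, y, alpha = 1)   # passes silently for the iris data
```

If any check fails, the helper stops with a message naming the violated requirement, so the call to dCor is never reached with inconsistent inputs.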