dcov.test {energy} | R Documentation |

Distance covariance test and distance correlation test of multivariate independence. Distance covariance and distance correlation are multivariate measures of dependence.

```
dcov.test(x, y, index = 1.0, R = NULL)
dcor.test(x, y, index = 1.0, R)
```

| Argument | Description |
|---|---|
| `x` | data or distances of first sample |
| `y` | data or distances of second sample |
| `R` | number of replicates |
| `index` | exponent on Euclidean distance, in (0,2] |

`dcov.test` and `dcor.test` are nonparametric tests of multivariate independence. The test decision is obtained via permutation bootstrap, with `R` replicates.

The sample sizes (number of rows) of the two samples must agree, and samples must not contain missing values. Arguments `x`, `y` can optionally be `dist` objects; otherwise these arguments are treated as data.

The `dcov` test statistic is `n \mathcal V_n^2`, where `\mathcal V_n(x,y)` = dcov(x,y) is based on interpoint Euclidean distances `\|x_{i}-x_{j}\|`. The `index` is an optional exponent on Euclidean distance.

Similarly, the `dcor` test statistic is based on the normalized coefficient, the distance correlation. (See the manual page for `dcor`.)
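For orientation, the empirical coefficients behind both statistics can be summarized as follows. This is a sketch of the definitions in (SRB 2007), written with `s` for the `index` exponent; the symbols `a_{ij}`, `A_{ij}`, and `B_{ij}` are notation introduced here, and the `dcov` topic remains the reference for what the package computes.

```latex
a_{ij} = \|x_i - x_j\|^{s}, \qquad
\bar a_{i\cdot} = \frac{1}{n}\sum_{j=1}^{n} a_{ij}, \quad
\bar a_{\cdot j} = \frac{1}{n}\sum_{i=1}^{n} a_{ij}, \quad
\bar a_{\cdot\cdot} = \frac{1}{n^2}\sum_{i,j=1}^{n} a_{ij}

A_{ij} = a_{ij} - \bar a_{i\cdot} - \bar a_{\cdot j} + \bar a_{\cdot\cdot}
\qquad \text{($B_{ij}$ is defined in the same way from $\|y_i - y_j\|^{s}$)}

\mathcal V_n^2(x,y) = \frac{1}{n^2}\sum_{i,j=1}^{n} A_{ij} B_{ij},
\qquad
\mathcal V_n^2(x) = \mathcal V_n^2(x,x)

\mathcal R_n^2(x,y) =
  \frac{\mathcal V_n^2(x,y)}{\sqrt{\mathcal V_n^2(x)\,\mathcal V_n^2(y)}}
  \quad \text{if } \mathcal V_n^2(x)\,\mathcal V_n^2(y) > 0,
  \qquad \mathcal R_n^2(x,y) = 0 \text{ otherwise}
```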

Distance correlation is a measure of dependence between random vectors introduced by Szekely, Rizzo, and Bakirov (2007). For all distributions with finite first moments, distance correlation `\mathcal R` generalizes the idea of correlation in two fundamental ways:

(1) `\mathcal R(X,Y)` is defined for `X` and `Y` in arbitrary dimension.

(2) `\mathcal R(X,Y)=0` characterizes independence of `X` and `Y`.

Characterization (2) also holds for powers of Euclidean distance `\|x_i-x_j\|^s`, where `0<s<2`, but (2) does not hold when `s=2`.
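The failure at `s=2` can be checked numerically. For univariate samples, the sample distance correlation with exponent 2 reduces to the absolute value of the Pearson correlation, so it cannot detect dependence between uncorrelated variables. The sketch below uses base R only; `dcor_sketch` is a hypothetical helper written for this illustration, not a function of the energy package, and it is not meant to reproduce the package's implementation.

```r
# Hypothetical helper (not part of energy): sample distance correlation of
# x and y, with exponent `index` applied to the Euclidean distances.
dcor_sketch <- function(x, y, index = 1) {
  # double-center the matrix of pairwise distances raised to `index`
  center <- function(z) {
    d <- as.matrix(dist(z))^index
    d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  }
  A <- center(x)
  B <- center(y)
  v2xy <- mean(A * B)   # squared sample distance covariance of (x, y)
  v2x  <- mean(A * A)   # squared sample distance variance of x
  v2y  <- mean(B * B)   # squared sample distance variance of y
  denom <- sqrt(v2x * v2y)
  # max() guards against tiny negative values caused by rounding
  if (denom > 0) sqrt(max(v2xy, 0) / denom) else 0
}

x <- seq(-1, 1, length.out = 21)  # symmetric about zero
y <- x^2                          # a function of x, yet uncorrelated with x

r1 <- dcor_sketch(x, y, index = 1)  # clearly positive: dependence is detected
r2 <- dcor_sketch(x, y, index = 2)  # numerically zero, like abs(cor(x, y))
```

Here `y` is completely determined by `x`, but only the coefficient with `index = 1` reflects that.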

Distance correlation satisfies `0 \le \mathcal R \le 1`, and `\mathcal R = 0` if and only if `X` and `Y` are independent. Distance covariance `\mathcal V` provides a new approach to the problem of testing the joint independence of random vectors. The formal definitions of the population coefficients `\mathcal V` and `\mathcal R` are given in (SRB 2007). The definitions of the empirical coefficients are given in the energy `dcov` topic.

For all values of the index in (0,2), under independence the asymptotic distribution of `n\mathcal V_n^2` is a quadratic form of centered Gaussian random variables, with coefficients that depend on the distributions of `X` and `Y`. For the general problem of testing independence when the distributions of `X` and `Y` are unknown, the test based on `n\mathcal V^2_n` can be implemented as a permutation test. See (SRB 2007) for theoretical properties of the test, including statistical consistency.
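The permutation approach can be sketched in base R as follows. `perm_dcov_sketch` is a hypothetical helper written for this illustration, not a function of the energy package, and the p-value convention it uses (counting the observed statistic among the replicates) is an assumption rather than a statement about the package internals.

```r
# Hypothetical helper (not part of energy): permutation test of independence
# based on the statistic n * V_n^2.
perm_dcov_sketch <- function(x, y, R = 199, index = 1) {
  # double-center the matrix of pairwise distances raised to `index`
  center <- function(z) {
    d <- as.matrix(dist(z))^index
    d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  }
  A <- center(x)
  B <- center(y)
  n <- nrow(A)
  stat <- n * mean(A * B)   # observed n * V_n^2
  # Permuting the sample indices of y permutes rows and columns of B together,
  # which breaks the pairing between x and y while keeping both marginals.
  reps <- replicate(R, {
    p <- sample.int(n)
    n * mean(A * B[p, p])
  })
  # assumed convention: the observed statistic counts as one replicate
  p.value <- (1 + sum(reps >= stat)) / (1 + R)
  list(statistic = stat, replicates = reps, p.value = p.value)
}

set.seed(1)
x <- matrix(rnorm(60), ncol = 2)        # 30 observations in dimension 2
y_dep <- x[, 1]^2 + 0.1 * rnorm(30)     # depends on x, but not linearly
y_ind <- rnorm(30)                      # generated independently of x
perm_dcov_sketch(x, y_dep)$p.value
perm_dcov_sketch(x, y_ind)$p.value
```

Because double centering commutes with a simultaneous permutation of rows and columns, the distance matrices only need to be centered once.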

`dcov.test` or `dcor.test` returns a list with class `htest` containing

| Component | Description |
|---|---|
| `method` | description of test |
| `statistic` | observed value of the test statistic |
| `estimate` | dCov(x,y) or dCor(x,y) |
| `estimates` | a vector: [dCov(x,y), dCor(x,y), dVar(x), dVar(y)] |
| `condition` | logical, permutation test applied |
| `replicates` | replicates of the test statistic |
| `p.value` | approximate p-value of the test |
| `n` | sample size |
| `data.name` | description of data |

For the dcov test of independence, the distance covariance test statistic is the V-statistic `\mathrm{n\, dCov^2} = n \mathcal{V}_n^2` (not dCov).
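The distinction between the statistic and the estimate can be checked on the returned object. This is a small sketch that assumes the energy package is installed and relies only on the components listed above (`statistic`, `estimate`, `n`); `unname` is used because the components may carry names.

```r
library(energy)

set.seed(1)
x <- iris[1:50, 1:4]
y <- iris[51:100, 1:4]
tst <- dcov.test(x, y, R = 199)

# the statistic is n * dCov^2, whereas `estimate` holds dCov itself
unname(tst$statistic)
tst$n * unname(tst$estimate)^2
```

The two printed values should agree up to floating-point rounding.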

Maria L. Rizzo mrizzo@bgsu.edu and Gabor J. Szekely

Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007),
Measuring and Testing Dependence by Correlation of Distances,
*Annals of Statistics*, Vol. 35, No. 6, 2769-2794.

doi: 10.1214/009053607000000505

Szekely, G.J. and Rizzo, M.L. (2009),
Brownian Distance Covariance,
*Annals of Applied Statistics*,
Vol. 3, No. 4, 1236-1265.

doi: 10.1214/09-AOAS312

Szekely, G.J. and Rizzo, M.L. (2009),
Rejoinder: Brownian Distance Covariance,
*Annals of Applied Statistics*, Vol. 3, No. 4, 1303-1308.

```
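library(energy)
# two samples of equal size (50 rows each) from the iris data
x <- iris[1:50, 1:4]
y <- iris[51:100, 1:4]
set.seed(1)  # make the permutation replicates reproducible
dcor.test(dist(x), dist(y), R=199)  # arguments supplied as dist objects
set.seed(1)
dcov.test(x, y, R=199)  # arguments supplied as data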
```

[Package *energy* version 1.7-10 Index]