dcorr {metrica} | R Documentation |
Distance Correlation
Description
It estimates the Distance Correlation coefficient (dcorr) for a continuous predicted-observed dataset.
Usage
dcorr(data = NULL, obs, pred, tidy = FALSE, na.rm = TRUE)
Arguments
data | (Optional) argument to call an existing data frame containing the data. |
obs | Vector with observed values (numeric). |
pred | Vector with predicted values (numeric). |
tidy | Logical argument (TRUE/FALSE) that sets the return type. TRUE returns a data.frame; FALSE (default) returns a list. |
na.rm | Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE. |
Details
The dcorr function is a wrapper for the dcor function from the energy
package; see Rizzo & Szekely (2022). The distance correlation (dcorr)
coefficient is a measure of dependence between random vectors introduced
by Szekely et al. (2007). It is symmetric in its arguments, which is
relevant for the predicted-observed (PO) case: the result does not depend
on which variable is treated as predicted and which as observed.
For all distributions with finite first moments, distance correlation
\mathcal R generalizes the idea of correlation in two fundamental ways:
(1) \mathcal R(P,O) is defined for P and O in arbitrary dimension.
(2) \mathcal R(P,O)=0 characterizes independence of P and O.
Distance correlation satisfies 0 \le \mathcal R \le 1, and \mathcal R = 0
only if P and O are independent. Distance covariance \mathcal V provides a
new approach to the problem of testing the joint independence of random
vectors. The formal definitions of the population coefficients \mathcal V
and \mathcal R are given in Szekely et al. (2007).
The empirical distance correlation \mathcal{R}_n(\mathbf{P},\mathbf{O}) is
the square root of
\mathcal{R}^2_n(\mathbf{P},\mathbf{O}) = \frac{\mathcal{V}^2_n(\mathbf{P},\mathbf{O})}{\sqrt{\mathcal{V}^2_n(\mathbf{P}) \, \mathcal{V}^2_n(\mathbf{O})}}.
For the full formula and more details, see the online documentation and the energy package.
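As an illustration only, the empirical coefficient can be sketched in base R for two numeric vectors. The helper name dcor_sketch is hypothetical and is not part of metrica or energy; it assumes the sample definition of Szekely et al. (2007), in which \mathcal{V}^2_n is the mean of the element-wise product of double-centered pairwise distance matrices. It is not the implementation that dcorr calls, and it makes no attempt to handle missing values.

```r
# Hypothetical helper (not part of metrica or energy): empirical distance
# correlation for two numeric vectors, using base R only.
dcor_sketch <- function(p, o) {
  stopifnot(is.numeric(p), is.numeric(o), length(p) == length(o))
  # Double-center a pairwise distance matrix:
  # A_kl = a_kl - (row mean k) - (column mean l) + (grand mean)
  center <- function(v) {
    d <- as.matrix(dist(v))
    d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  }
  A <- center(p)
  B <- center(o)
  v2_po <- max(mean(A * B), 0)  # V_n^2(P, O), clipped against rounding error
  v2_p  <- mean(A * A)          # V_n^2(P)
  v2_o  <- mean(B * B)          # V_n^2(O)
  denom <- sqrt(v2_p * v2_o)
  if (denom > 0) sqrt(v2_po / denom) else 0
}

set.seed(1)
P <- rnorm(100, mean = 0, sd = 10)
O <- P + rnorm(100, mean = 0, sd = 3)
r_po <- dcor_sketch(P, O)
r_op <- dcor_sketch(O, P)  # same value: the coefficient is symmetric

# A purely nonlinear relationship: Pearson correlation is close to zero,
# while the distance correlation is clearly positive.
x <- seq(-1, 1, length.out = 101)
y <- x^2
c(pearson = cor(x, y), dcor = dcor_sketch(x, y))
```

The second part of the sketch shows why \mathcal R = 0 is a stronger statement than a zero Pearson correlation: the quadratic relationship is invisible to cor() but not to the distance-based coefficient.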
Value
An object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).
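The two return types can be compared side by side. The sketch below assumes the metrica package is installed (the guard skips the calls otherwise) and uses a hypothetical data frame df with columns P and O, passed through the data argument described above.

```r
# Assumes metrica is installed; the guard keeps the script runnable otherwise.
if (requireNamespace("metrica", quietly = TRUE)) {
  set.seed(1)
  # Hypothetical example data: P = predictions, O = observations
  df <- data.frame(P = rnorm(100, mean = 0, sd = 10))
  df$O <- df$P + rnorm(100, mean = 0, sd = 3)

  # Default (tidy = FALSE): numeric value inside a list
  res_list <- metrica::dcorr(data = df, obs = O, pred = P)
  # tidy = TRUE: the same value inside a data frame
  res_tidy <- metrica::dcorr(data = df, obs = O, pred = P, tidy = TRUE)

  str(res_list)
  str(res_tidy)
}
```

The data frame form is usually the more convenient one when several metrics are to be bound together into a single table.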
References
Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007).
Measuring and testing dependence by correlation of distances. Annals of Statistics, Vol. 35(6): 2769-2794.
doi:10.1214/009053607000000505.
Rizzo, M., and Szekely, G. (2022).
energy: E-Statistics: Multivariate Inference via the Energy of Data.
R package version 1.7-10.
https://CRAN.R-project.org/package=energy.
See Also
eval_tidy, defusing-advanced, dcor, energy
Examples
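library(metrica)
set.seed(1)
# P: simulated predictions; O: observations (P plus random noise)
P <- rnorm(n = 100, mean = 0, sd = 10)
O <- P + rnorm(n = 100, mean = 0, sd = 3)
# dcorr is symmetric, so swapping obs and pred gives the same value
dcorr(obs = O, pred = P)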