bcor {Ball} | R Documentation |
Ball Covariance and Correlation Statistics
Description
Computes Ball Covariance and Ball Correlation statistics, which are generic dependence measures in Banach spaces.
Usage
bcor(x, y, distance = FALSE, weight = FALSE)
bcov(x, y, distance = FALSE, weight = FALSE)
Arguments
x |
a numeric vector, matrix, data.frame, or a list containing at least two numeric vectors, matrices, or data.frames. |
y |
a numeric vector, matrix, or data.frame. |
distance |
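if distance = TRUE, x and y are treated as distance matrices (see Details); otherwise they are treated as data. Default: distance = FALSE. |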
weight |
a logical or character string used to choose the weight form of the Ball Covariance statistic.
If input is a character string, it must be one of |
Details
The sample sizes of the two variables must agree, and the samples must not contain missing or infinite values.
If distance = TRUE, the arguments x and y can each be a dist object or a symmetric numeric matrix recording the distances between samples; otherwise, these arguments are treated as data.
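As an illustration of the two accepted distance inputs, the following sketch builds a dist object and a symmetric numeric matrix with base R. The variable names are made up for this example, and the call to bcov is guarded so that the snippet also runs when the Ball package is not installed.

```r
set.seed(1)
x <- matrix(rnorm(20 * 3), nrow = 20)  # 20 observations in 3 dimensions
y <- matrix(rnorm(20 * 2), nrow = 20)  # 20 observations in 2 dimensions

dx <- dist(x)             # a "dist" object (Euclidean by default)
dy <- dist(y)
dy_mat <- as.matrix(dy)   # the same distances as a symmetric numeric matrix

# Only call into Ball if it is available on this machine
if (requireNamespace("Ball", quietly = TRUE)) {
  print(Ball::bcov(dx, dy, distance = TRUE))
}
```

Any other metric can be substituted by computing the distances yourself and passing them in the same way.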
bcov and bcor compute the Ball Covariance and Ball Correlation statistics, respectively.
The Ball Covariance statistic is a generic dependence measure in Banach spaces. It enjoys the following properties:
It is nonnegative, and it equals zero if and only if the variables are unassociated;
It is highly robust;
It is distribution-free and model-free;
Interestingly, the HHG statistic is a special case of the Ball Covariance statistic.
The Ball Correlation statistic, a normalized version of the Ball Covariance statistic, generalizes the Pearson correlation in two fundamental ways:
It is well defined for random variables of arbitrary dimension in Banach spaces;
A Ball Correlation of zero implies that the random variables are unassociated.
The definitions of the Ball Covariance and Ball Correlation statistics between two random variables are as follows.
Suppose we are given pairs of independent observations \{(x_1, y_1),...,(x_n,y_n)\}, where x_i and y_i can be of any dimension, and the dimensions of x_i and y_i need not be the same.
Then, we define the sample version of Ball Covariance as:
\mathbf{BCov}_{\omega, n}^{2}(X, Y)=\frac{1}{n^{2}}\sum_{i,j=1}^{n}{(\Delta_{ij,n}^{X,Y}-\Delta_{ij,n}^{X}\Delta_{ij,n}^{Y})^{2}}
where:
\Delta_{ij,n}^{X,Y}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{X} \delta_{ij,k}^{Y}},
\Delta_{ij,n}^{X}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{X}},
\Delta_{ij,n}^{Y}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{Y}},
\delta_{ij,k}^{X} = I(x_{k} \in \bar{B}(x_{i}, \rho(x_{i}, x_{j}))),
\delta_{ij,k}^{Y} = I(y_{k} \in \bar{B}(y_{i}, \rho(y_{i}, y_{j}))).
Here, \bar{B}(x_{i}, \rho(x_{i}, x_{j})) is the closed ball with center x_{i} and radius \rho(x_{i}, x_{j}), where \rho denotes the distance between two observations and I(\cdot) is the indicator function.
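The sample definition above can be traced step by step in base R. The following is a minimal sketch, not the package's implementation: the function name ball_cov_sketch is hypothetical, \rho is assumed to be the Euclidean distance, and no weights are applied. Whether bcov returns exactly this quantity is not confirmed here.

```r
# Hypothetical helper: direct O(n^3) evaluation of the sample formula.
# Assumes Euclidean distance and no weighting.
ball_cov_sketch <- function(x, y) {
  dx <- as.matrix(dist(x))  # dx[i, k] = rho(x_i, x_k)
  dy <- as.matrix(dist(y))
  n <- nrow(dx)
  stopifnot(nrow(dy) == n)
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # delta_{ij,k}: is observation k inside the closed ball
      # centred at observation i with radius rho(i, j)?
      in_x <- dx[i, ] <= dx[i, j]
      in_y <- dy[i, ] <= dy[i, j]
      delta_xy <- mean(in_x & in_y)  # Delta_{ij,n}^{X,Y}
      delta_x <- mean(in_x)          # Delta_{ij,n}^{X}
      delta_y <- mean(in_y)          # Delta_{ij,n}^{Y}
      total <- total + (delta_xy - delta_x * delta_y)^2
    }
  }
  total / n^2
}

ball_cov_sketch(1:10, 1:10)        # identical variables: strictly positive
ball_cov_sketch(1:10, rep(1, 10))  # constant y: every term vanishes
```

With a constant y every ball in the y space contains all observations, so each joint proportion equals the x proportion and the statistic is zero.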
Similarly, we can define \mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{X}) and \mathbf{BCov}_{\omega,n}^2(\mathbf{Y},\mathbf{Y}).
We define the Ball Correlation statistic as follows:
\mathbf{BCor}_{\omega,n}^2(\mathbf{X},\mathbf{Y})=\mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{Y})/\sqrt{\mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{X})\mathbf{BCov}_{\omega,n}^2(\mathbf{Y},\mathbf{Y})}
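The normalization can be illustrated with a short base-R sketch. The names ball_cov2 and ball_cor2_sketch are hypothetical, the distance is assumed to be Euclidean, no weights are applied, and returning 0 when the denominator is zero is a convention chosen for this example only. It is not a description of how bcor is implemented.

```r
# Hypothetical helper: unweighted sample Ball Covariance from two
# n x n distance matrices.
ball_cov2 <- function(dx, dy) {
  n <- nrow(dx)
  total <- 0
  for (i in seq_len(n)) {
    # Entry [k, j] is TRUE when observation k lies in the closed ball
    # centred at observation i with radius d(i, j).
    in_x <- outer(dx[i, ], dx[i, ], "<=")
    in_y <- outer(dy[i, ], dy[i, ], "<=")
    delta_x <- colMeans(in_x)
    delta_y <- colMeans(in_y)
    delta_xy <- colMeans(in_x & in_y)
    total <- total + sum((delta_xy - delta_x * delta_y)^2)
  }
  total / n^2
}

# Hypothetical helper: BCov(X, Y) / sqrt(BCov(X, X) * BCov(Y, Y)).
ball_cor2_sketch <- function(x, y) {
  dx <- as.matrix(dist(x))
  dy <- as.matrix(dist(y))
  denom <- sqrt(ball_cov2(dx, dx) * ball_cov2(dy, dy))
  if (denom == 0) return(0)  # assumed convention for degenerate input
  ball_cov2(dx, dy) / denom
}

x <- 1:20
ball_cor2_sketch(x, 2 * x + 3)  # same ball structure in x and y
```

Because y = 2x + 3 rescales every distance by the same factor, the balls in the two spaces contain the same observations, so the numerator and denominator coincide in this case.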
We can extend \mathbf{BCov}_{\omega,n} to measure the mutual independence between K random variables:
\frac{1}{n^{2}}\sum_{i,j=1}^{n}{\left[ (\Delta_{ij,n}^{X_{1}, ..., X_{K}}-\prod_{k=1}^{K}\Delta_{ij,n}^{X_{k}})^{2}\prod_{k=1}^{K}{\hat{\omega}_{k}(X_{ki},X_{kj})} \right]}
where X_{k} (k=1,\ldots,K) are random variables and X_{ki} is the i-th observation of X_{k}.
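To make the K-variable formula concrete, here is a base-R sketch that evaluates it for a list of variables. It assumes Euclidean distance and sets every weight \hat{\omega}_{k} to 1, because the weight functions are not defined in this page; the function name ball_cov_mutual_sketch is hypothetical and the code is not taken from the package.

```r
# Hypothetical helper: mutual-independence statistic with all weights
# set to 1 (an assumption made for illustration).
ball_cov_mutual_sketch <- function(vars) {
  dists <- lapply(vars, function(v) as.matrix(dist(v)))
  n <- nrow(dists[[1]])
  stopifnot(n >= 2, all(vapply(dists, nrow, integer(1)) == n))
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Column k of `inside` marks which observations fall in the
      # ball of variable k centred at observation i with radius d_k(i, j).
      inside <- vapply(dists, function(d) d[i, ] <= d[i, j], logical(n))
      joint <- mean(rowSums(inside) == ncol(inside))  # in all K balls at once
      marginal <- prod(colMeans(inside))              # product of the K proportions
      total <- total + (joint - marginal)^2
    }
  }
  total / n^2
}

x <- 1:10
ball_cov_mutual_sketch(list(x, x, x))          # strongly dependent triple
ball_cov_mutual_sketch(list(x, rep(1, 10)))    # second variable is constant
```

For K = 2 and unit weights this reduces to the pairwise sample formula given earlier.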
See bcov.test for a test of independence based on the Ball Covariance statistic.
Value
bcor |
Ball Correlation statistic. |
bcov |
Ball Covariance statistic. |
References
Wenliang Pan, Xueqin Wang, Heping Zhang, Hongtu Zhu & Jin Zhu (2019) Ball Covariance: A Generic Measure of Dependence in Banach Space, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1543600
Wenliang Pan, Xueqin Wang, Weinan Xiao & Hongtu Zhu (2018) A Generic Sure Independence Screening Procedure, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1462709
Jin Zhu, Wenliang Pan, Wei Zheng & Xueqin Wang (2021) Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces, Journal of Statistical Software, 97(6), DOI: 10.18637/jss.v097.i06
See Also
bcov.test
Examples
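library(Ball)
set.seed(1)  # make the random draws below reproducible

############# Ball Correlation #############
# x and y are identical, so they are perfectly dependent
num <- 50
x <- 1:num
y <- 1:num
bcor(x, y)
bcor(x, y, weight = "prob")
bcor(x, y, weight = "chisq")

############# Ball Covariance #############
# x and y are drawn independently of each other
num <- 50
x <- rnorm(num)
y <- rnorm(num)
bcov(x, y)
bcov(x, y, weight = "prob")
bcov(x, y, weight = "chisq")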