bd {Ball} | R Documentation |
Ball Divergence statistic
Description
Computes the Ball Divergence statistic, which is a generic dispersion measure in Banach spaces.
Usage
bd(
  x,
  y = NULL,
  distance = FALSE,
  size = NULL,
  num.threads = 1,
  kbd.type = c("sum", "maxsum", "max")
)
Arguments
x |
a numeric vector, matrix, data.frame, or a list containing at least two numeric vectors, matrices, or data.frames. |
y |
a numeric vector, matrix, data.frame. |
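distance |
if distance = TRUE, x and y are treated as distances (a dist object or a symmetric numeric distance matrix) rather than as data; see Details. Default: distance = FALSE. |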
size |
a vector recording sample size of each group. |
num.threads |
number of threads. Default: num.threads = 1. |
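kbd.type |
a character string specifying the K-sample Ball Divergence statistic; one of "sum", "maxsum", or "max" (see Details). Default: kbd.type = "sum". |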
Details
Given samples that contain no missing values, bd returns the Ball Divergence statistic.
If distance = TRUE, the arguments x and y can be a dist object or a symmetric numeric matrix recording the distances between samples; otherwise, these arguments are treated as data.
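As a minimal sketch of the distance = TRUE route, the snippet below builds a dist object over the pooled observations and passes it to bd together with the group sizes. It assumes that the pooled distance object lists the observations of the first group before those of the second and that size gives the group sizes in that order; the names dx and group_size are illustrative. The call to bd is skipped when the Ball package is not installed.

```r
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)

# Pooled distance object: observations of x first, then those of y
# (assumed ordering; group_size records how many rows belong to each group)
dx <- dist(c(x, y))
group_size <- c(length(x), length(y))

# Call bd() only when the Ball package is available
if (requireNamespace("Ball", quietly = TRUE)) {
  Ball::bd(dx, distance = TRUE, size = group_size)
}
```

Any metric can be substituted for the Euclidean distance used by dist here, which is the main reason to supply distances instead of raw data.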
The Ball Divergence statistic measures the difference between the distributions of two datasets in Banach spaces. The population Ball Divergence is proven to be zero if and only if the two underlying probability measures are identical (see References).
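The Ball Divergence statistic is defined as follows.
Consider two independent samples $\{X_1, \ldots, X_n\}$ with the associated probability measure $\mu$ and $\{Y_1, \ldots, Y_m\}$ with $\nu$, where the observations in each sample are i.i.d.
Let $\delta(x, y, z) = I(z \in \bar{B}(x, \rho(x, y)))$, where $\delta(x, y, z)$ indicates whether $z$ is located in the closed ball $\bar{B}(x, \rho(x, y))$ with center $x$ and radius $\rho(x, y)$, the distance between $x$ and $y$.
We denote:
$$A_{ij}^{X} = \frac{1}{n}\sum_{u=1}^{n} \delta(X_i, X_j, X_u), \quad A_{ij}^{Y} = \frac{1}{m}\sum_{v=1}^{m} \delta(X_i, X_j, Y_v),$$
$$C_{kl}^{X} = \frac{1}{n}\sum_{u=1}^{n} \delta(Y_k, Y_l, X_u), \quad C_{kl}^{Y} = \frac{1}{m}\sum_{v=1}^{m} \delta(Y_k, Y_l, Y_v).$$
$A_{ij}^{X}$ represents the proportion of samples $\{X_1, \ldots, X_n\}$ located in the ball $\bar{B}(X_i, \rho(X_i, X_j))$, and $A_{ij}^{Y}$ represents the proportion of samples $\{Y_1, \ldots, Y_m\}$ located in the ball $\bar{B}(X_i, \rho(X_i, X_j))$.
Meanwhile, $C_{kl}^{X}$ and $C_{kl}^{Y}$ represent the corresponding proportions located in the ball $\bar{B}(Y_k, \rho(Y_k, Y_l))$.
The Ball Divergence statistic is defined as:
$$D_{n,m} = A_{n,m} + C_{n,m},$$
where
$$A_{n,m} = \frac{1}{n^2}\sum_{i,j=1}^{n} \left(A_{ij}^{X} - A_{ij}^{Y}\right)^2, \quad C_{n,m} = \frac{1}{m^2}\sum_{k,l=1}^{m} \left(C_{kl}^{X} - C_{kl}^{Y}\right)^2.$$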
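In words, following the paper listed under References: for every closed ball whose center and radius-defining point both come from the same sample, the statistic compares the proportion of each sample falling inside the ball, squares the difference, and averages over all such balls; the averages for the two samples are then added. The base R sketch below computes this directly from that description. The helper ball_divergence_sketch is hypothetical and not part of the Ball package, it uses the Euclidean distance, and its output may differ from bd (for example in scaling or tie handling), so treat it as an illustration only.

```r
# Two-sample Ball Divergence computed directly from the definition.
# `ball_divergence_sketch` is a hypothetical helper, not part of the Ball package.
ball_divergence_sketch <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  m <- nrow(y)
  # Pairwise Euclidean distances over the pooled sample (x first, then y)
  d <- as.matrix(dist(rbind(x, y)))
  ix <- seq_len(n)
  iy <- n + seq_len(m)

  # Mean squared difference between the two samples' proportions inside
  # every closed ball whose center and radius-defining point lie in `idx`
  ball_term <- function(idx) {
    total <- 0
    for (i in idx) {
      for (j in idx) {
        r <- d[i, j]                     # radius of the ball centered at point i
        prop_x <- mean(d[i, ix] <= r)    # share of x inside the ball
        prop_y <- mean(d[i, iy] <= r)    # share of y inside the ball
        total <- total + (prop_x - prop_y)^2
      }
    }
    total / length(idx)^2
  }

  ball_term(ix) + ball_term(iy)
}

set.seed(1)
x <- rnorm(30)
y <- rnorm(30, mean = 2)
d_same <- ball_divergence_sketch(x, x)   # identical samples
d_diff <- ball_divergence_sketch(x, y)   # location-shifted sample
```

The double loop costs on the order of (n + m)^3 operations, which is fine for illustration but far slower than an optimized implementation.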
Ball Divergence can be generalized to the K-sample test problem. Suppose we have $K$ groups of samples, where the $k$-th group includes $n^{(k)}$ samples.
The $K$-sample Ball Divergence statistic can be defined in three ways: by directly summing the two-sample Ball Divergence statistics of all sample pairs (kbd.type = "sum"):
$$\sum_{1 \le k < l \le K} D_{n^{(k)}, n^{(l)}},$$
by finding the one sample with the largest difference to the others (kbd.type = "maxsum"):
$$\max_{t} \sum_{s=1,\, s \neq t}^{K} D_{n^{(s)}, n^{(t)}},$$
or by aggregating the $K-1$ largest two-sample Ball Divergence statistics (kbd.type = "max"):
$$\sum_{k=1}^{K-1} D_{(k)},$$
where $D_{(1)}, \ldots, D_{(K-1)}$ are the largest $K-1$ two-sample Ball Divergence statistics among $\{D_{n^{(s)}, n^{(t)}} \mid 1 \le s < t \le K\}$. When $K = 2$, the three types of Ball Divergence statistics reduce to the two-sample Ball Divergence statistic.
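The three aggregation rules can be sketched in base R as follows, assuming the pairwise two-sample statistics are already available in a symmetric K x K matrix. The helper aggregate_kbd and the numbers in pairwise are made up for illustration and are not part of the Ball package; for "max" the sketch sums the K - 1 largest pairwise values.

```r
# Combine pairwise two-sample statistics into a K-sample statistic.
# `aggregate_kbd` is a hypothetical helper; `pairwise` is a symmetric K x K
# matrix whose [s, t] entry holds the two-sample statistic for groups s and t.
aggregate_kbd <- function(pairwise, kbd.type = c("sum", "maxsum", "max")) {
  kbd.type <- match.arg(kbd.type)
  K <- nrow(pairwise)
  diag(pairwise) <- 0
  pair_values <- pairwise[upper.tri(pairwise)]   # one value per unordered pair
  switch(kbd.type,
    sum    = sum(pair_values),
    maxsum = max(rowSums(pairwise)),
    max    = sum(sort(pair_values, decreasing = TRUE)[seq_len(K - 1)])
  )
}

# Illustrative (made-up) pairwise statistics for K = 3 groups
pairwise <- matrix(c(0,   0.1, 0.4,
                     0.1, 0,   0.3,
                     0.4, 0.3, 0), nrow = 3, byrow = TRUE)
kbd_sum    <- aggregate_kbd(pairwise, "sum")     # 0.1 + 0.4 + 0.3
kbd_maxsum <- aggregate_kbd(pairwise, "maxsum")  # group 3: 0.4 + 0.3
kbd_max    <- aggregate_kbd(pairwise, "max")     # two largest: 0.4 + 0.3

# With K = 2 all three rules return the single pairwise statistic
two_groups <- matrix(c(0, 0.25, 0.25, 0), nrow = 2)
k2 <- sapply(c("sum", "maxsum", "max"), function(type) aggregate_kbd(two_groups, type))
```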
See bd.test for a test of distribution equality based on the Ball Divergence.
Value
bd |
Ball Divergence statistic |
Author(s)
Wenliang Pan, Yuan Tian, Xueqin Wang, Heping Zhang
References
Wenliang Pan, Yuan Tian, Xueqin Wang, Heping Zhang. Ball Divergence: Nonparametric two sample test. Ann. Statist. 46 (2018), no. 3, 1109–1137. doi:10.1214/17-AOS1579. https://projecteuclid.org/euclid.aos/1525313077
See Also
bd.test
Examples
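############# Ball Divergence #############
library(Ball)
# Two independent univariate samples drawn from the same distribution
x <- rnorm(50)
y <- rnorm(50)
# Two-sample Ball Divergence statistic
bd(x, y)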