kendall.tau {VGAM} | R Documentation |
Kendall's Tau Statistic
Description
Computes Kendall's Tau, which is a rank-based correlation measure, between two vectors.
Usage
kendall.tau(x, y, exact = FALSE, max.n = 3000)
Arguments
x, y |
Numeric vectors. Must be of equal length.
Ideally their values are continuous and not too discrete.
Let |
exact |
Logical. If |
max.n |
Numeric. If |
Details
Kendall's tau is a measure of dependency in a bivariate distribution.
Loosely, two random variables are concordant if large values of one random variable are associated with large values of the other random variable.
Similarly, two random variables are discordant if large values of one random variable are associated with small values of the other random variable.
More formally, if (x[i] - x[j])*(y[i] - y[j]) > 0 then that comparison is concordant (i != j), and if (x[i] - x[j])*(y[i] - y[j]) < 0 then that comparison is discordant (i != j).
Out of choose(N, 2) pairwise comparisons, where N = length(x), let c and d be the number of concordant and discordant pairs, respectively.
Then Kendall's tau can be estimated by (c-d)/(c+d).
If there are ties then half the ties are deemed concordant and half discordant, so that (c-d)/(c+d+t) is used, where t is the number of tied pairs.
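The counting rule above can be sketched in base R. This is only an illustration of the formula (c-d)/(c+d+t) as described in this section, not the code used inside kendall.tau; the helper name tau_bruteforce is hypothetical. It builds N x N matrices, so it is only suitable for small N.

```r
# Hypothetical helper (not part of VGAM): brute-force Kendall's tau
# using the (c - d) / (c + d + t) formula described above.
tau_bruteforce <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  # Sign of (x[i] - x[j]) * (y[i] - y[j]) for every pair (i, j)
  s <- sign(outer(x, x, "-") * outer(y, y, "-"))
  keep <- upper.tri(s)            # count each unordered pair once
  n_conc <- sum(s[keep] > 0)      # concordant pairs, c
  n_disc <- sum(s[keep] < 0)      # discordant pairs, d
  n_tied <- sum(s[keep] == 0)     # tied pairs, t
  (n_conc - n_disc) / (n_conc + n_disc + n_tied)
}

# 10 pairs in total: 8 concordant, 2 discordant, no ties
tau_bruteforce(c(1, 2, 3, 4, 5), c(2, 1, 4, 3, 5))  # 0.6
```

Without ties this agrees with cor(x, y, method = "kendall") from base R. With ties the two can differ, because cor applies a different tie correction.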
Value
Kendall's tau, which lies between -1 and 1.
Warning
If length(x) is large then the cost is O(N^2), which is expensive!
Under these circumstances it is not advisable to set exact = TRUE or max.n to a very large number.
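To give a feel for the quadratic growth, the following small sketch counts the choose(N, 2) pairwise comparisons for a few vector lengths. The lengths are illustrative choices: 3000 is the default max.n and 5000 is the N used in the Examples.

```r
# Number of pairwise comparisons an all-pairs computation must make
n_values <- c(100, 3000, 5000)
n_pairs <- choose(n_values, 2)
data.frame(N = n_values, pairs = n_pairs)
# N = 100  ->     4950 pairs
# N = 3000 ->  4498500 pairs
# N = 5000 -> 12497500 pairs
```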
See Also
Examples
N <- 5000
true.rho <- -0.8
ymat <- rbinorm(N, cov12 = true.rho) # Bivariate normal, aka N_2
x <- ymat[, 1]
y <- ymat[, 2]
## Not run: plot(x, y, col = "blue")
kendall.tau(x, y) # A random sample is taken here
kendall.tau(x, y) # Another random sample, so the value may differ slightly
kendall.tau(x, y, exact = TRUE) # Costly if length(x) is large
kendall.tau(x, y, max.n = N) # Same as exact = TRUE
(rhohat <- sin(kendall.tau(x, y) * pi / 2)) # rho = sin(pi * tau / 2) holds for N_2
true.rho # rhohat should be near this value