cor.fk {pcaPP} | R Documentation |
Fast estimation of Kendall's tau rank correlation coefficient
Description
Calculates Kendall's tau rank correlation coefficient in O(n log n) time, rather than the O(n^2) of the current R implementation (cor with method = "kendall").
Usage
cor.fk (x, y = NULL)
Arguments
x |
A vector, a matrix or a data frame of data. |
y |
A vector of data. |
Details
The code of this implementation of the fast Kendall's tau correlation algorithm
was originally published by David Simcha.
Because its runtime is O(n log n), it is substantially faster than the
current R implementation (O(n^2)), especially for large numbers of
observations.
The algorithm goes back to Knight (1966) and has been described in more detail
by Abrevaya (1999) and Christensen (2005).
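The idea behind the O(n log n) bound can be illustrated with a short base R sketch. This is not the pcaPP code: it is a simplified illustration that assumes data without ties, and the names merge_count, sort_count and kendall_tau_sketch are hypothetical. After ordering the pairs by x, every discordant pair shows up as an inversion in the y sequence, and inversions can be counted during a merge sort.

```r
## Merge two ascending vectors and count pairs (a[i], b[j]) with a[i] > b[j].
merge_count <- function (a, b) {
  out <- numeric (length (a) + length (b))
  i <- 1; j <- 1; k <- 1; inv <- 0
  while (i <= length (a) && j <= length (b)) {
    if (a[i] <= b[j]) {
      out[k] <- a[i]; i <- i + 1
    } else {
      ## b[j] is smaller than every remaining element of a
      out[k] <- b[j]; j <- j + 1
      inv <- inv + (length (a) - i + 1)
    }
    k <- k + 1
  }
  ## exactly one of the two inputs still has elements left
  if (i <= length (a)) out[k:length (out)] <- a[i:length (a)]
  if (j <= length (b)) out[k:length (out)] <- b[j:length (b)]
  list (sorted = out, inv = inv)
}

## Merge sort that also returns the number of inversions in v.
sort_count <- function (v) {
  n <- length (v)
  if (n < 2) return (list (sorted = v, inv = 0))
  mid <- n %/% 2
  left <- sort_count (v[1:mid])
  right <- sort_count (v[(mid + 1):n])
  merged <- merge_count (left$sorted, right$sorted)
  list (sorted = merged$sorted, inv = left$inv + right$inv + merged$inv)
}

## Kendall's tau for tie-free data: tau = 1 - 4 D / (n (n - 1)),
## where D is the number of discordant pairs.
kendall_tau_sketch <- function (x, y) {
  stopifnot (length (x) == length (y), length (x) >= 2,
             anyDuplicated (x) == 0, anyDuplicated (y) == 0)
  n <- length (x)
  discordant <- sort_count (y[order (x)])$inv
  1 - 4 * discordant / (n * (n - 1))
}

set.seed (100)
x <- rnorm (200)
y <- x + rnorm (200)
all.equal (kendall_tau_sketch (x, y), cor (x, y, method = "kendall"))
```

The sketch refuses tied input on purpose: handling ties requires additional bookkeeping of tied groups in x, in y and in both, which is omitted here.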
Value
The estimated correlation coefficient.
Author(s)
David Simcha, Heinrich Fritz, Christophe Croux, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>
References
Knight, W. R. (1966). A Computer Method for Calculating Kendall's Tau with Ungrouped Data.
Journal of the American Statistical Association, 61(314), Part 1, 436-439.
Christensen, D. (2005). Fast algorithms for the calculation of Kendall's Tau.
Computational Statistics, 20, 51-62.
Abrevaya, J. (1999). Computation of the Maximum Rank Correlation Estimator.
Economics Letters, 62, 279-285.
See Also
Examples
set.seed (100) ## creating test data
n <- 1000
x <- rnorm (n)
y <- x+ rnorm (n)
tim <- proc.time ()[1] ## applying cor.fk
cor.fk (x, y)
cat ("cor.fk runtime [s]:", proc.time ()[1] - tim, "(n =", length (x), ")\n")
tim <- proc.time ()[1] ## applying cor (standard R implementation)
cor (x, y, method = "kendall")
cat ("cor runtime [s]:", proc.time ()[1] - tim, "(n =", length (x), ")\n")
## applying cor and cor.fk on data containing ties (as.integer () truncates values, producing repeats)
Xt <- cbind (c (x, as.integer (x)), c (y, as.integer (y)))
tim <- proc.time ()[1] ## applying cor.fk
cor.fk (Xt)
cat ("cor.fk runtime [s]:", proc.time ()[1] - tim, "(n =", nrow (Xt), ")\n")
tim <- proc.time ()[1] ## applying cor (standard R implementation)
cor (Xt, method = "kendall")
cat ("cor runtime [s]:", proc.time ()[1] - tim, "(n =", nrow (Xt), ")\n")