mediandiff {eummd} | R Documentation |
Compute the median difference between pairs of values
Description
Compute the median of all differences between distinct pairs in vectors or matrices.
Usage
mediandiff(X, Y = NULL, kernel = c("Laplacian", "Gaussian"), fast = FALSE)
Arguments
X |
Numeric vector or matrix of length |
Y |
Numeric vector or matrix of length |
kernel |
String, either |
fast |
Boolean; if |
Details
The median difference is defined as follows:
is the combined
and
values into a single vector
or matrix.
Number of columns is the dimension, and these need to be equal
for
and
. Then, if the Laplacian kernel is used,
where is the 1-norm, and so if the data
are
d
-dimensional then
If the Gaussian kernel is specified, then the square of the two-norm is used.
The median heuristic is defined as beta = 1/m
.
Naive method will compute all distinct pairs, of which there are
differences. These are then sorted using
a
algorithm, so overall
.
The fast method is is from Croux and Rousseeuw (1992),
which is based on Johnson and Mizoguchi (1978).
Value
A scalar, the median of all pairwise differences.
References
Croux, C. and Rousseeuw, P. J. (1992), "Time-Efficient Algorithms for Two Highly Robust Estimators of Scale" In Computational Statistics: Volume 1: Proceedings of the 10th Symposium on Computational Statistics (pp. 411-428).
Johnson, D.B., and Mizoguchi, T. (1978), "Selecting the Kth Element in X + Y and X_1 + X_2 + ... + X_m", SIAM Journal of Computing, 7, 147-153.
See Also
medianheuristic
Examples
X <- c(7.1, 1.2, 4.3, 0.4)
Y <- c(5.5, 2.6, 8.7)
#using fast method, Laplacian kernel, loglinear in number of observations
md <- mediandiff(X, Y, fast=TRUE)
#using fast method, Gaussian kernel, loglinear in number of observations
md <- mediandiff(X, Y, fast=TRUE, kernel="Gaussian")
#using naive method (default), with Laplacian kernel
md <- mediandiff(X, Y)