rdist {rdist} | R Documentation
rdist: an R package for distances
Description
rdist provides a common framework to calculate distances. There are three main functions:
- rdist computes the pairwise distances between observations in one matrix and returns a dist object,
- pdist computes the pairwise distances between observations in one matrix and returns a matrix, and
- cdist computes the distances between observations in two matrices and returns a matrix.
In particular, an equivalent of the cdist function is often missing from other distance implementations. All calculations involving NA values will consistently return NA.
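To make the cross-distance idea concrete, here is a minimal base R sketch of what a cdist-style computation does for the Euclidean metric. The helper name cross_dist_sketch is hypothetical and is not part of the package; the sketch assumes that rows are observations, and it only illustrates the shape of the result and the NA behaviour described above, not the package's actual implementation.

```r
# Hypothetical helper, not part of rdist: Euclidean cross-distances in base R.
# Assumption: each row of X and Y is one observation.
cross_dist_sketch <- function(X, Y) {
  stopifnot(is.matrix(X), is.matrix(Y), ncol(X) == ncol(Y))
  out <- matrix(NA_real_, nrow = nrow(X), ncol = nrow(Y))
  for (i in seq_len(nrow(X))) {
    for (j in seq_len(nrow(Y))) {
      # sum() propagates NA, so any NA in either row yields NA
      out[i, j] <- sqrt(sum((X[i, ] - Y[j, ])^2))
    }
  }
  out
}

X <- rbind(c(0, 0), c(3, 4), c(NA, 1))
Y <- rbind(c(0, 0), c(6, 8))
D <- cross_dist_sketch(X, Y)  # 3 x 2 matrix; the third row is all NA
```

The result has one row per observation in X and one column per observation in Y, which is the layout the description gives for cdist.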
Usage
rdist(X, metric = "euclidean", p = 2L)
pdist(X, metric = "euclidean", p = 2)
cdist(X, Y, metric = "euclidean", p = 2)
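A short usage sketch of the three functions follows. It assumes the rdist package is installed and that rows of the input matrices are observations; the data are random and purely illustrative.

```r
library(rdist)  # assumes the package is installed

set.seed(1)
X <- matrix(rnorm(12), nrow = 4)  # 4 observations, 3 variables (assumed row-wise)
Y <- matrix(rnorm(6), nrow = 2)   # 2 observations, 3 variables

d_dist  <- rdist(X)                        # dist object
d_full  <- pdist(X)                        # 4 x 4 matrix
d_cross <- cdist(X, Y)                     # 4 x 2 matrix
d_manh  <- cdist(X, Y, metric = "manhattan")
```

rdist and pdist describe the same pairwise distances; the difference is only the return type (a dist object versus a full matrix).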
Arguments
X, Y | A matrix
metric | The distance metric to use
p | The power of the Minkowski distance
Details
Available distance measures are (written for two vectors v and w):
- "euclidean": \sqrt{\sum_i (v_i - w_i)^2}
- "minkowski": (\sum_i |v_i - w_i|^p)^{1/p}
- "manhattan": \sum_i |v_i - w_i|
- "maximum" or "chebyshev": \max_i |v_i - w_i|
- "canberra": \sum_i \frac{|v_i - w_i|}{|v_i| + |w_i|}
- "angular": \cos^{-1}(cor(v, w))
- "correlation": \sqrt{\frac{1 - cor(v, w)}{2}}
- "absolute_correlation": \sqrt{1 - |cor(v, w)|^2}
- "hamming": (\sum_i 1_{v_i \neq w_i}) / \sum_i 1
- "jaccard": (\sum_i 1_{v_i \neq w_i}) / \sum_i 1_{v_i \neq 0 \lor w_i \neq 0}
Alternatively, metric can be any function that defines a distance between two vectors.
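As a reading aid, a few of the formulas can be transcribed directly into base R. The function names below are illustrative only and are not exported by the package; they are plain two-argument functions of v and w, which is the shape a user-supplied distance function would be expected to have. This is a sketch of the formulas as written, not the package's own code.

```r
# Base R transcriptions of some of the listed formulas; names are hypothetical.
minkowski_sketch   <- function(v, w, p = 2) sum(abs(v - w)^p)^(1 / p)
correlation_sketch <- function(v, w) sqrt((1 - cor(v, w)) / 2)
hamming_sketch     <- function(v, w) sum(v != w) / length(v)
jaccard_sketch     <- function(v, w) sum(v != w) / sum(v != 0 | w != 0)

v <- c(1, 0, 2, 0)
w <- c(1, 3, 0, 0)

m1 <- minkowski_sketch(v, w, p = 1)  # p = 1 is Manhattan: 0 + 3 + 2 + 0 = 5
h  <- hamming_sketch(v, w)           # 2 of 4 positions differ: 0.5
j  <- jaccard_sketch(v, w)           # 2 differing / 3 positions with a nonzero entry
```

With p = 1 the Minkowski formula reduces to "manhattan", and with p = 2 to "euclidean", which is why a single p argument covers both.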