rowDist {ChemoSpecUtils} | R Documentation |
Compute Distance Between Rows of a Matrix
Description
This function computes the distance between rows of a matrix using a number of methods.
It is primarily a wrapper for Dist, which provides many options.
However, cosine distance is calculated locally.
See the reference for an excellent summary of distances and similarities.
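As a rough illustration of what a locally computed cosine distance can look like, here is a minimal base R sketch. It assumes the common definition, 1 minus the cosine similarity, and the helper name cosine_row_dist is hypothetical; it is not the package's internal code.

```r
# Hypothetical helper (an assumption for illustration, not rowDist's own code):
# cosine distance between all pairs of rows, taken as 1 - cosine similarity.
cosine_row_dist <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x))
  norms <- sqrt(rowSums(x^2))                  # Euclidean length of each row
  sim <- tcrossprod(x) / outer(norms, norms)   # pairwise cosine similarities
  stats::as.dist(1 - sim)                      # keep the lower triangle as a dist
}

# Three toy rows: two orthogonal vectors and one halfway between them
M <- rbind(c(1, 0), c(0, 1), c(1, 1))
d <- cosine_row_dist(M)
print(d)
```

Rows with zero length would give a division by zero here, so a real implementation would need to decide how to handle them.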
Keep in mind that distances are always non-negative by definition. Further, in the literature one
can find the same distance defined in different ways. For instance, the definitions of the
"pearson" and "correlation" distances differ slightly between the reference below and Dist.
So please study the definitions carefully to get the one you want.
The example illustrates the behavior of some common distance definitions. Notice that "pearson"
and "cosine" are mathematically identical for the particular definition of "pearson" used by Dist.
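The difference between the two definitions can be sketched in base R. This assumes that Dist's "pearson" is the not-centered form (vectors used as-is) and "correlation" is the centered form (means subtracted first); that reading should be checked against ?amap::Dist. The vectors a and b are made up for illustration.

```r
# Two made-up "spectra" for illustration
a <- c(1, 2, 3, 4)
b <- c(2, 1, 4, 3)

# Not-centered form (assumed to match Dist's "pearson"):
# identical to 1 - cosine similarity
uncentered <- 1 - sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Centered form (assumed to match Dist's "correlation"):
# subtract each vector's mean first, which gives 1 - cor(a, b)
ac <- a - mean(a)
bc <- b - mean(b)
centered <- 1 - sum(ac * bc) / sqrt(sum(ac^2) * sum(bc^2))

print(c(uncentered = uncentered, centered = centered))
```

For the same pair of vectors the two forms give different values, which is why the choice of definition matters.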
Usage
rowDist(x, method)
Arguments
x | A matrix whose rows will be used for the distance calculation.
method | Character; one of
Value
An object of class dist.
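For readers unfamiliar with dist objects, the following sketch shows how one is typically inspected. It uses stats::dist as a stand-in for a rowDist result so that it runs without extra packages; the matrix M and its row names are made up.

```r
# Toy matrix with named rows; stats::dist stands in for a rowDist() result
M <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))
d <- stats::dist(M, method = "euclidean")

class(d)    # "dist"
length(d)   # a dist stores only the n * (n - 1) / 2 pairwise values

# Convert to a full symmetric matrix to look up a pair by row name
dm <- as.matrix(d)
dm["a", "b"]
```

A dist object can also be passed directly to functions such as hclust.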
Author(s)
Bryan A. Hanson (DePauw University).
References
R. Todeschini, D. Ballabio, V. Consonni, "Distances and Similarity Measures in Chemometrics and Chemoinformatics," in Encyclopedia of Analytical Chemistry, Wiley and Sons, 2020. doi:10.1002/9780470027318.a9438.pub2
Examples
# You need to install package "amap" to run the examples
if (requireNamespace("amap", quietly = TRUE)) {
# This example imagines spectra as a series of vectors
# on a half unit circle.
# 1. Compute half of a unit circle
theta <- seq(0, pi, length.out = 100)
x <- cos(theta)
y <- sin(theta)
# 2. Compute some illustrative vectors
# Get tail/origin & tip/head coordinates
lt <- length(theta)
set.seed(6)
tips <- theta[c(1, sample(2:100, 5))]
x0 <- y0 <- rep(0.0, length(tips)) # tail/origin at 0,0, one per arrow
x1 <- cos(tips) # tips/heads
y1 <- sin(tips)
# 3. Compute the distance functions
# Bounded distances
RDcor <- rep(NA_real_, lt) # correlation distance
RDpea <- rep(NA_real_, lt) # pearson distance
RDabp <- rep(NA_real_, lt) # abspearson distance
RDcos <- rep(NA_real_, lt) # cosine distance
# Unbounded distances
RDeuc <- rep(NA_real_, lt) # Euclidean distance
RDman <- rep(NA_real_, lt) # manhattan distance
# Compute all
np <- 5
refVec <- c(seq(0.0, x[1], length.out = np), seq(0.0, y[1], length.out = np))
for (i in seq_len(lt)) {
  Vec <- c(seq(0.0, x[i], length.out = np), seq(0.0, y[i], length.out = np))
  M <- matrix(c(refVec, Vec), nrow = 2, byrow = TRUE)
  RDman[i] <- rowDist(M, method = "manhattan")
  RDeuc[i] <- rowDist(M, method = "euclidean")
  RDcos[i] <- rowDist(M, method = "cosine")
  RDcor[i] <- rowDist(M, method = "correlation")
  RDpea[i] <- rowDist(M, method = "pearson")
  RDabp[i] <- rowDist(M, method = "abspearson")
}
# 4. Plots
# a. Unit circle w/representative vectors/spectra
plot.new()
plot.window(xlim = c(-1, 1), ylim = c(0, 1), asp = 1)
title(main = "Representative 'Spectral' Vectors on a Unit Half Circle\nReference Vector in Red",
      sub = "Each 'spectrum' is represented by a series of x, y points")
lines(x, y, col = "gray") # draw half circle
lines(x = x[c(1,100)], y = y[c(1,100)], col = "gray") # line across bottom
arrows(x0, y0, x1, y1, angle = 5) # add arrows & a red reference vector
arrows(x0[1], y0[1], x1[1], y1[1], col = "red", angle = 5, lwd = 2)
# b. Distances
degrees <- theta*180/pi
plot(degrees, RDman, type = "l",
     xlab = "Angle Between Spectral Vectors and Reference Vector in Degrees",
     ylab = "Distance",
     main = "Spectral Distance Comparisons\nUsing ChemoSpecUtils::rowDist")
abline(h = c(1.0, 2.0), col = "gray")
lines(degrees, RDeuc, col = "blue")
lines(degrees, RDcos, col = "green", lwd = 4)
lines(degrees, RDcor, col = "red")
lines(degrees, RDabp, col = "black", lty = 2)
lines(degrees, RDpea, col = "black", lty = 3)
leg.txt <- c("manhattan", "euclidean", "correlation", "cosine", "pearson", "abspearson")
leg.col <- c("black", "blue", "red", "green", "black", "black")
leg.lwd <- c(1, 1, 1, 4, 1, 1)
leg.lty <- c(1, 1, 1, 1, 3, 2)
legend("topleft", legend = leg.txt, col = leg.col, lwd = leg.lwd, lty = leg.lty)
}