rowDist {ChemoSpecUtils}    R Documentation

Compute Distance Between Rows of a Matrix

Description

This function computes the distance between the rows of a matrix using one of several methods. It is primarily a wrapper for Dist (package amap), which provides many options; cosine distance, however, is calculated locally. See the reference for an excellent summary of distances and similarities.

Keep in mind that distances are non-negative by definition. Further, the same distance may be defined in different ways in the literature. For instance, the definitions of the "pearson" and "correlation" distances differ slightly between the reference below and Dist, so study the definitions carefully to get the one you want. The example illustrates the behavior of some common distance definitions. Notice that "pearson" and "cosine" are mathematically identical for the particular definition of "pearson" used by Dist.
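
The difference between an uncentered and a centered definition can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes cosine distance means one minus the cosine of the angle between two rows, and that a "correlation"-style distance means one minus the ordinary (centered) correlation. The helper names cosineDist and corrDist are hypothetical; check the Dist documentation for the definitions it actually uses.

```r
# Hypothetical helper (not part of ChemoSpecUtils): cosine distance between
# all pairs of rows, assuming the definition 1 - cos(angle between rows).
cosineDist <- function(x) {
  x <- as.matrix(x)
  cp <- tcrossprod(x)        # pairwise dot products of the rows
  nrm <- sqrt(diag(cp))      # Euclidean norm of each row
  as.dist(1 - cp / outer(nrm, nrm))
}

# Hypothetical helper: centered distance, assuming 1 - cor() between rows.
corrDist <- function(x) {
  as.dist(1 - cor(t(as.matrix(x))))
}

M <- rbind(c(1, 2, 3),   # row 1
           c(2, 4, 6),   # row 2: row 1 scaled by 2
           c(3, 2, 1))   # row 3: row 1 reversed

dCos <- cosineDist(M)
dCor <- corrDist(M)
round(as.matrix(dCos), 4)
round(as.matrix(dCor), 4)
```

Rows 1 and 2 are parallel, so both distances are zero for that pair. Row 3 is row 1 reversed: the centered form gives the maximum distance of 2, while the uncentered form gives 2/7, which is why the choice of definition matters.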

Usage

rowDist(x, method)

Arguments

x

A matrix whose rows will be used for the distance calculation.

method

Character; one of "cosine", "euclidean", "maximum", "manhattan", "canberra", "binary", "pearson", "correlation", "spearman", "kendall", "abspearson", "abscorrelation".

Value

An object of class dist.
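
Because the return value is a dist object, the usual base R tools for that class apply. The sketch below uses stats::dist as a stand-in for rowDist (an assumption made only so the example runs without additional packages) to show how such an object can be inspected and passed on; the matrix and row names are invented for illustration.

```r
# Three illustrative 'spectra' with two data points each (made-up values)
M <- matrix(c(0, 0,
              3, 4,
              6, 8),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2", "s3"), NULL))

d <- dist(M, method = "euclidean")  # stand-in for rowDist(M, "euclidean")

attr(d, "Size")      # number of rows that were compared
length(d)            # number of unique pairs, n * (n - 1) / 2
dm <- as.matrix(d)   # full symmetric matrix with a zero diagonal
dm["s1", "s3"]       # look up one pair by row name
hc <- hclust(d)      # dist objects can be passed directly to hclust
```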

Author(s)

Bryan A. Hanson, DePauw University.

References

R. Todeschini, D. Ballabio, V. Consonni, "Distances and Similarity Measures in Chemometrics and Chemoinformatics," in Encyclopedia of Analytical Chemistry, Wiley and Sons, 2020. doi: 10.1002/9780470027318.a9438.pub2

Examples

# This example imagines spectra as a series of vectors
# on a half unit circle.
# 1. Compute half of a unit circle
theta <- seq(0, pi, length.out = 100)
x <- cos(theta)
y <- sin(theta)

# 2. Compute some illustrative vectors
# Get tail/origin & tip/head coordinates
lt <- length(theta)
set.seed(6)
tips <- theta[c(1, sample(2:100, 5))]
x0 <- y0 <- rep(0.0, length(tips)) # tail/origin at 0,0, one per arrow
x1 <- cos(tips) # tips/heads
y1 <- sin(tips)

# 3. Compute the distance functions
# Bounded distances
RDcor <- rep(NA_real_, lt) # correlation distance
RDpea <- rep(NA_real_, lt) # pearson distance
RDabp <- rep(NA_real_, lt) # abspearson distance
RDcos <- rep(NA_real_, lt) # cosine distance

# Unbounded distances
RDeuc <- rep(NA_real_, lt) # Euclidean distance
RDman <- rep(NA_real_, lt) # manhattan distance

# Compute all
np <- 5
refVec <- c(seq(0.0, x[1], length.out = np), seq(0.0, y[1], length.out = np))
for (i in 1:lt) {
  Vec <- c(seq(0.0, x[i], length.out = np), seq(0.0, y[i], length.out = np))
  M <- matrix(c(refVec, Vec), nrow = 2, byrow = TRUE)
  RDman[i] <- rowDist(M, method = "manhattan")
  RDeuc[i] <- rowDist(M, method = "euclidean")
  RDcos[i] <- rowDist(M, method = "cosine")
  RDcor[i] <- rowDist(M, method = "correlation")
  RDpea[i] <- rowDist(M, method = "pearson")
  RDabp[i] <- rowDist(M, method = "abspearson")
}

# 4. Plots
# a. Unit circle w/representative vectors/spectra
plot.new()
plot.window(xlim = c(-1, 1), ylim = c(0, 1), asp = 1)
title(main = "Representative 'Spectral' Vectors on a Unit Half Circle\nReference Vector in Red",
  sub = "Each 'spectrum' is represented by a series of x, y points") 
lines(x, y, col = "gray") # draw half circle
lines(x = x[c(1,100)], y = y[c(1,100)], col = "gray") # line across bottom
arrows(x0, y0, x1, y1, angle = 5) # add arrows & a red reference vector
arrows(x0[1], y0[1], x1[1], y1[1], col = "red", angle = 5, lwd = 2)

# b. Distances
degrees <- theta*180/pi
plot(degrees, RDman, type = "l",
  xlab = "Angle Between Spectral Vectors and Reference Vector in Degrees",
  ylab = "Distance",
  main = "Spectral Distance Comparisons\nUsing ChemoSpecUtils::rowDist")
abline(h = c(1.0, 2.0), col = "gray")
lines(degrees, RDeuc, col = "blue")
lines(degrees, RDcos, col = "green", lwd = 4)
lines(degrees, RDcor, col = "red")
lines(degrees, RDabp, col = "black", lty = 2)
lines(degrees, RDpea, col = "black", lty = 3)
leg.txt <- c("manhattan", "euclidean", "correlation", "cosine", "pearson", "abspearson")
leg.col <- c("black", "blue", "red", "green", "black", "black")
leg.lwd <- c(1, 1, 1, 4, 1, 1)
leg.lty <- c(1, 1, 1, 1, 3, 2)
legend("topleft", legend = leg.txt, col = leg.col, lwd = leg.lwd, lty = leg.lty)
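
For the geometry used in this example, two of the curves have simple closed forms, which gives a quick sanity check on the plot. The sketch below recomputes the cosine and Euclidean distances in base R (not through rowDist, so that it runs without additional packages) and compares them with 1 - cos(theta) and 2 * sqrt(sum(s^2)) * sin(theta / 2). The names s, ref, v, cosD and eucD are introduced here for illustration only, and the cosine formula assumes the definition 1 - cos(angle).

```r
theta <- seq(0, pi, length.out = 100)
np <- 5
s <- seq(0, 1, length.out = np)  # ramp shared by every 'spectrum'

# Equivalent to refVec in the example: the ramp scaled by cos and sin
ref <- c(s * cos(theta[1]), s * sin(theta[1]))

cosD <- eucD <- numeric(length(theta))
for (i in seq_along(theta)) {
  v <- c(s * cos(theta[i]), s * sin(theta[i]))  # equivalent to Vec
  cosD[i] <- 1 - sum(ref * v) / sqrt(sum(ref^2) * sum(v^2))
  eucD[i] <- sqrt(sum((ref - v)^2))
}

# Compare with the closed forms implied by the construction
all.equal(cosD, 1 - cos(theta))
all.equal(eucD, 2 * sqrt(sum(s^2)) * sin(theta / 2))
```

Under these assumptions the cosine distance runs from 0 to 2 as the angle goes from 0 to 180 degrees, which matches the gray guide lines drawn at 1 and 2, while the Euclidean distance depends on the length of the vectors and so has no fixed upper bound.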



[Package ChemoSpecUtils version 0.4.96]