hammingdists {cultevo} | R Documentation |
Pairwise Hamming distances between matrix rows.
Description
Returns a distance matrix giving all pairwise Hamming distances between the
rows of its argument meanings, which can be a matrix, data frame or
vector. Vectors are treated as matrices with a single column, so the
distances in the return value can only be 0 or 1.
Usage
hammingdists(meanings)
Arguments
meanings |
a matrix with the different dimensions encoded along
columns, and all combinations of meanings specified along rows. The data
type of the cells does not matter since distance is simply based on
equality (with the exception of NA values, see Details). |
Details
This function behaves differently from calling
dist(meanings, method="manhattan") in how NA values are treated:
specifying a meaning component as NA allows you to ignore that dimension
for the given row/meaning combinations (instead of counting a difference
between NA and another value as a distance of 1).
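The NA handling described above can be illustrated with a minimal base R sketch. The helper hamming_sketch is a hypothetical name introduced here for illustration only; it is not part of cultevo and is not claimed to match the package's actual implementation. It assumes that a dimension is skipped whenever either of the two rows has NA in it.

```r
# Hypothetical helper (not part of cultevo): pairwise Hamming distances
# between rows, skipping any dimension where either row is NA.
hamming_sketch <- function(meanings) {
  m <- as.matrix(meanings)  # vectors become single-column matrices
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # comparisons involving NA evaluate to NA and are dropped by na.rm
      d[i, j] <- sum(m[i, ] != m[j, ], na.rm = TRUE)
    }
  }
  as.dist(d)
}

m <- matrix(c(0, 0, 0, 1, 1, NA), ncol = 2, byrow = TRUE)
hamming_sketch(m)

# For comparison: base R's Manhattan distance on the same data
dist(m, method = "manhattan")
```

According to the documentation of stats::dist, columns with missing values are excluded and the remaining sum is scaled up proportionally, so the two calls should be expected to differ for the row containing NA.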
Value
A distance matrix of type dist holding n*(n-1)/2 pairwise distances,
where n is the number of rows in meanings.
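As a rough illustration of how a dist object is laid out, the following sketch uses base R's dist() on made-up data (the matrix x is an assumption for the example, not taken from cultevo). Any object of class dist should have the same structure.

```r
# Four rows of arbitrary numeric data, so n = 4
x <- matrix(c(0, 0, 0, 1, 1, 0, 1, 1), ncol = 2, byrow = TRUE)
d <- dist(x, method = "manhattan")

n <- nrow(x)
length(d)          # number of stored distances: n * (n - 1) / 2
attr(d, "Size")    # number of observations: n
dim(as.matrix(d))  # expanded to a full symmetric n x n matrix
```

Only the lower triangle is stored, which is why as.matrix() is needed to index distances by row and column.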
See Also
Examples
# a 2x2 design using strings
print(strings <- matrix(c("a1", "b1", "a1", "b2", "a2", "b1", "a2", "b2"),
ncol=2, byrow=TRUE))
hammingdists(strings)
# a 2x3 design using integers
print(integers <- matrix(c(0, 0, 0, 1, 0, 2, 1, 0, 1, 1, 1, 2), ncol=2, byrow=TRUE))
hammingdists(integers)
# a 3x2 design using factors (ncol is always the number of dimensions)
print(factors <- data.frame(colour=c("red", "red", "green", "blue"),
animal=c("dog", "cat", "dog", "cat"), stringsAsFactors=TRUE))
hammingdists(factors)
# if some meaning dimension is not relevant for some combinations of
# meanings (e.g. optional arguments), specifying them as NA in the matrix
# will make them not be counted towards the Hamming distance! in this
# example the value of the second dimension does not matter (and does not
# count towards the distance) when the first dimension has value '1'
print(ignoredimension <- matrix(c(0, 0, 0, 1, 1, NA), ncol=2, byrow=TRUE))
hammingdists(ignoredimension)
# trivial case of a vector: first and last two elements are identical,
# otherwise a difference of one
hammingdists(c(0, 0, 1, 1))