hammingdists {cultevo}    R Documentation

Pairwise Hamming distances between matrix rows.

Description

Returns a distance matrix giving all pairwise Hamming distances between the rows of its argument meanings, which can be a matrix, data frame or vector. A vector is treated as a matrix with a single column, so the distances in the return value can only be 0 or 1.

Usage

hammingdists(meanings)

Arguments

meanings

a matrix, data frame or vector, with the different dimensions encoded along columns and all combinations of meanings specified along rows. The data type of the cells does not matter since distance is simply based on equality (with the exception of NA values, see below).

Details

This function behaves differently from calling dist(meanings, method="manhattan") in how NA values are treated: specifying a meaning component as NA allows you to ignore that dimension for the given row/meaning combinations (instead of counting a difference between NA and another value as a distance of 1).
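
The NA rule described above can be illustrated in base R. The following is a minimal sketch only: hamming_sketch is a hypothetical helper written for this illustration, not the cultevo implementation, and it assumes that a comparison involving NA should simply be dropped from the count.

```r
# Hypothetical helper (not part of cultevo): counts the columns in which two
# rows differ, skipping any column where either of the two rows is NA.
hamming_sketch <- function(meanings) {
  m <- as.matrix(meanings)  # a plain vector becomes a single-column matrix
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # comparisons involving NA evaluate to NA and are dropped by na.rm=TRUE
      d[i, j] <- sum(m[i, ] != m[j, ], na.rm=TRUE)
    }
  }
  as.dist(d)
}

# the third row leaves the second dimension unspecified
m <- matrix(c(0, 0, 0, 1, 0, NA), ncol=2, byrow=TRUE)
d <- hamming_sketch(m)
print(d)
```

In this sketch the third row ends up at distance 0 from both of the other rows, because its NA component is never compared, while the first two rows remain at distance 1 from each other.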

Value

A distance matrix of type dist holding the n*(n-1)/2 pairwise distances between rows, where n is the number of rows in meanings.
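
The size of a dist object can be checked with base R alone. A minimal sketch, using stats::dist in place of hammingdists so that it runs without cultevo installed (the variable names d and n are chosen for illustration):

```r
# four meanings in a single dimension, passed as a plain vector
d <- dist(c(0, 0, 1, 1), method="manhattan")

n <- attr(d, "Size")       # number of rows in the original data
print(length(d))           # number of stored pairwise distances
print(n * (n - 1) / 2)     # the same number, computed from n
print(dim(as.matrix(d)))   # expanding to the full symmetric n x n matrix
```

A dist object stores only the lower triangle of the distance matrix; as.matrix() expands it to the full square form when that is more convenient to index.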

See Also

dist

Examples

# a 2x2 design using strings
print(strings <- matrix(c("a1", "b1", "a1", "b2", "a2", "b1", "a2", "b2"),
  ncol=2, byrow=TRUE))
hammingdists(strings)

# a 2x3 design using integers
print(integers <- matrix(c(0, 0, 0, 1, 0, 2, 1, 0, 1, 1, 1, 2), ncol=2, byrow=TRUE))
hammingdists(integers)

# a 3x2 design using factors (ncol is always the number of dimensions)
print(factors <- data.frame(colour=c("red", "red", "green", "blue"),
                            animal=c("dog", "cat", "dog", "cat")))
hammingdists(factors)

# if some meaning dimension is not relevant for some combinations of
# meanings (e.g. optional arguments), specifying it as NA in the matrix
# means it is not counted towards the Hamming distance. In this
# example the value of the second dimension does not matter (and does not
# count towards the distance) when the first dimension has value '1'
print(ignoredimension <- matrix(c(0, 0, 0, 1, 1, NA), ncol=2, byrow=TRUE))
hammingdists(ignoredimension)

# trivial case of a vector: the first two and the last two elements are
# identical (distance 0), all other pairs differ (distance 1)
hammingdists(c(0, 0, 1, 1))

[Package cultevo version 1.0.2 Index]