Hamming {TreeTools} | R Documentation |
Hamming distance between taxa in a phylogenetic dataset
Description
The Hamming distance between a pair of taxa is the number of characters with a different coding, i.e. the smallest number of evolutionary steps that must have occurred since their common ancestor.
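As a minimal sketch of the idea, assuming every token is unambiguous, the count of differently coded characters can be written in base R. `CountDifferences()` is a hypothetical helper used for illustration only; it is not part of TreeTools and ignores the ambiguity handling that `Hamming()` provides.

```r
# Hypothetical helper (not a TreeTools function): count the characters at
# which two taxa carry a different coding.
CountDifferences <- function(a, b) {
  stopifnot(length(a) == length(b))
  sum(a != b)
}

taxonA <- c("0", "0", "1", "0", "1")
taxonB <- c("0", "1", "1", "1", "1")

# The two taxa differ at the second and fourth characters
CountDifferences(taxonA, taxonB)
```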
Usage
Hamming(
dataset,
ratio = TRUE,
ambig = c("median", "mean", "zero", "one", "na", "nan")
)
Arguments
dataset |
Object of class phyDat |
ratio |
Logical specifying whether to weight distance against maximum possible, given that a token that is ambiguous in either of two taxa cannot contribute to the total distance between the pair. |
ambig |
Character specifying the value to return when a pair of taxa have a zero maximum distance (perhaps due to a preponderance of ambiguous tokens).
"median", the default, takes the median of all other distance values;
"mean" takes the mean;
"zero" sets the distance to zero; "one" sets it to one;
"NA" to |
Details
Tokens that contain the inapplicable state are treated as requiring no steps to transform into any applicable token.
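The interaction between ratio, ambiguous tokens and inapplicable tokens can be illustrated with a rough base R sketch. `SketchDistance()` is a hypothetical helper, not the TreeTools implementation. It assumes that "?" is the only ambiguous token, that "-" is the only token containing the inapplicable state, and that characters of either kind are excluded from the maximum possible distance; the last point in particular is an assumption rather than documented behaviour.

```r
# Hypothetical helper, not TreeTools code; see the assumptions stated above.
SketchDistance <- function(a, b, ratio = TRUE) {
  # A character is uninformative if either taxon is ambiguous or inapplicable
  uninformative <- a %in% c("?", "-") | b %in% c("?", "-")
  steps <- sum(a != b & !uninformative)
  if (!ratio) {
    return(steps)
  }
  maxPossible <- sum(!uninformative)
  # A zero maximum distance leaves the ratio undefined
  if (maxPossible == 0) NA_real_ else steps / maxPossible
}

a <- c("0", "0", "?", "1", "-")
b <- c("0", "1", "1", "?", "2")
rawSteps <- SketchDistance(a, b, ratio = FALSE) # One differing character
proportion <- SketchDistance(a, b)              # Out of two informative ones

# Analogous in spirit to ambig = "median": fill undefined values with the
# median of the defined distances
distances <- c(0.5, 0.25, NA, 0.75)
distances[is.na(distances)] <- median(distances, na.rm = TRUE)
```

With these inputs only the first two characters are informative, so the raw count is one step and the proportion is one half.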
Value
Hamming() returns an object of class dist listing the Hamming distance between each pair of taxa.
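Objects of class dist store only the lower triangle of the distance matrix. The sketch below uses made-up distances and taxon names (not output from `Hamming()`) to show how base R functions can be used to inspect such an object.

```r
# Illustrative distances only; the values and taxon names are invented
taxa <- c("Taxon_A", "Taxon_B", "Taxon_C")
m <- matrix(c(0,   0.5, 1,
              0.5, 0,   0.25,
              1,   0.25, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(taxa, taxa))
d <- as.dist(m)

# Taxon names are stored with the object
labels(d)

# Convert to a full symmetric matrix to look up a single pair
pairDistance <- as.matrix(d)["Taxon_A", "Taxon_C"]
```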
Author(s)
Martin R. Smith (martin.smith@durham.ac.uk)
See Also
Used to construct neighbour joining trees in NJTree().
dist.hamming() in the phangorn package provides an alternative implementation.
Examples
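library("TreeTools") # Provides MatrixToPhyDat() and Hamming()

# Six taxa scored for five characters: "?" is ambiguous, "-" is inapplicable,
# and "{01}" is ambiguous between states 0 and 1
tokens <- matrix(c(0, 0, "0", 0, "?",
                   0, 0, "1", 0, 1,
                   0, 0, "1", 0, 1,
                   0, 0, "2", 0, 1,
                   1, 1, "-", "?", 0,
                   1, 1, "2", 1, "{01}"),
                 nrow = 6, ncol = 5, byrow = TRUE,
                 dimnames = list(
                   paste0("Taxon_", LETTERS[1:6]),
                   paste0("Char_", 1:5)))

# Convert the token matrix to a phyDat object, then compute pairwise distances
dataset <- MatrixToPhyDat(tokens)
Hamming(dataset)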