AllSplitPairings {TreeDist} | R Documentation |
Variation of information for all split pairings
Description
Calculate the variation of clustering information (Meila 2007) for each possible pairing of non-trivial splits on n leaves (Smith 2020), tabulating the number of pairings with each similarity.
Usage
AllSplitPairings(n)
Arguments
n |
Integer specifying the number of leaves in a tree. |
Value
AllSplitPairings()
returns a named vector. The name of each
element corresponds to a certain variation of information, in bits; the
value of each element specifies the number of pairings of non-trivial
splits that give rise to that variation of information.
Split AB|CD
is treated as distinct from CD|AB
. If pairing
AB|CD
=CD|AB
is considered equivalent to CD|AB
=CD|AB
(etc), then
values should be divided by four.
Author(s)
Martin R. Smith (martin.smith@durham.ac.uk)
References
Meila M (2007).
“Comparing clusterings—an information based distance.”
Journal of Multivariate Analysis, 98(5), 873–895.
doi:10.1016/j.jmva.2006.11.013.
Smith MR (2020).
“Information theoretic Generalized Robinson-Foulds metrics for comparing phylogenetic trees.”
Bioinformatics, 36(20), 5007–5013.
doi:10.1093/bioinformatics/btaa614.
Examples
AllSplitPairings(6)
# Treat equivalent splits as identical by dividing by four:
AllSplitPairings(6) / 4L