fluidity {micropan}    R Documentation
Computing genomic fluidity for a pan-genome
Description
Computes the genomic fluidity, which is a measure of population diversity.
Usage
fluidity(pan.matrix, n.sim = 10)
Arguments
pan.matrix
A pan-matrix, typically constructed by panMatrix.
n.sim
An integer specifying the number of random pairs of genomes to sample in the computations (default 10).
Details
The genomic fluidity between two genomes is defined as the number of gene families found in only one of the two genomes divided by the total number of gene families in the two genomes (Kislyuk et al., 2011). This is averaged over ‘n.sim’ random pairs of genomes to obtain a population estimate.
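The pairwise ratio and the averaging over random pairs can be sketched in a few lines of R. This is an illustration only, not the package source: the helper names pair_fluidity and fluidity_sketch and the toy matrix are hypothetical. It assumes that rows of the pan-matrix are genomes and columns are gene clusters, and that the denominator is the sum of the two genomes' gene-family counts, as in the formula of Kislyuk et al. (2011).

```r
# Hypothetical helper (not part of micropan): fluidity for one pair of genomes.
# Counts are first reduced to presence/absence.
pair_fluidity <- function(g1, g2) {
  g1 <- g1 > 0
  g2 <- g2 > 0
  n_unique <- sum(xor(g1, g2))    # families present in exactly one genome
  n_total  <- sum(g1) + sum(g2)   # assumed denominator: families in g1 plus families in g2
  n_unique / n_total
}

# Hypothetical sketch of the averaging step over n.sim random pairs.
fluidity_sketch <- function(pan.matrix, n.sim = 10) {
  n_genomes <- nrow(pan.matrix)
  values <- numeric(n.sim)
  for (i in seq_len(n.sim)) {
    pair <- sample(n_genomes, 2)  # two distinct genomes (rows)
    values[i] <- pair_fluidity(pan.matrix[pair[1], ], pan.matrix[pair[2], ])
  }
  c(Mean = mean(values), Std = sd(values))
}

# Toy pan-matrix: 3 genomes (rows) by 4 gene clusters (columns).
toy <- matrix(c(1, 1, 0, 0,
                1, 1, 0, 0,
                0, 0, 2, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("G", 1:3), paste0("C", 1:4)))

set.seed(1)
fluidity_sketch(toy, n.sim = 100)
```

Because the pairs are drawn at random, the estimate varies from run to run unless a seed is set; a larger ‘n.sim’ reduces that variation.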
The genomic fluidity between two genomes describes their degree of overlap with respect to gene
cluster content. If the fluidity is 0.0, the two genomes contain identical gene clusters. If it
is 1.0, the two genomes are non-overlapping. The difference between a Jaccard distance (see
distJaccard) and genomic fluidity is small: both measure overlap between genomes, but fluidity
is computed for the population by averaging over many pairs, while Jaccard distances are
computed for every pair. Note that only the presence/absence of gene clusters is considered,
not multiple occurrences.
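To make the comparison with the Jaccard distance concrete, the following sketch computes both quantities for a single pair of genomes. The vectors are made-up toy data. The Jaccard distance is taken as one minus the ratio of shared to total distinct clusters, and the fluidity denominator is assumed to be the sum of the two genomes' cluster counts, as in Kislyuk et al. (2011); neither line is taken from the package source.

```r
# Toy presence/absence of 4 gene clusters in two genomes (hypothetical data).
g1 <- c(1, 1, 1, 0) > 0
g2 <- c(1, 0, 0, 1) > 0

shared   <- sum(g1 & g2)      # clusters present in both genomes
only_one <- sum(xor(g1, g2))  # clusters present in exactly one genome

# Jaccard distance: 1 - |intersection| / |union|
jaccard_dist <- 1 - shared / sum(g1 | g2)

# Pairwise fluidity (assumed denominator: clusters in g1 plus clusters in g2)
pair_fluid <- only_one / (sum(g1) + sum(g2))

c(Jaccard = jaccard_dist, Fluidity = pair_fluid)
```

Under these assumptions both measures are 0 for identical genomes and 1 for non-overlapping genomes, and they differ in between only because shared clusters are counted once in the Jaccard denominator and twice in the fluidity denominator.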
The input ‘pan.matrix’ is typically constructed by panMatrix.
Value
A vector with two elements, the mean fluidity and its sample standard deviation over the ‘n.sim’ computed values.
Author(s)
Lars Snipen and Kristian Hovde Liland.
References
Kislyuk, A.O., Haegeman, B., Bergman, N.H., Weitz, J.S. (2011). Genomic fluidity: an integrative view of gene diversity within microbial populations. BMC Genomics, 12:32.
See Also
Examples
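# Load the example pan-matrix shipped with this package
data(xmpl.panmat)
# Estimate fluidity from this pan-matrix (default n.sim = 10 random genome pairs)
fluid <- fluidity(xmpl.panmat)
# Inspect the mean fluidity and its standard deviation
print(fluid)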