GUniFrac {MiSPU} | R Documentation |
Generalized UniFrac distances for comparing microbial communities.
Description
A generalized version of commonly used UniFrac distances. It is defined as:
d^{(\alpha)} = \frac{\sum_{i=1}^m b_i (p^A_{i} + p^B_{i})^\alpha \left\vert \frac{p^A_{i} - p^B_{i}}{p^A_{i} + p^B_{i}} \right\vert}{\sum_{i=1}^m b_i (p^A_{i} + p^B_{i})^\alpha},
where m is the number of branches, b_i is the length of the i-th branch, and p^A_{i}, p^B_{i} are the branch proportions for communities A and B.
The generalized UniFrac distance contains an extra parameter \alpha that controls the weight placed on abundant lineages, so that the distance is not dominated by highly abundant lineages. \alpha = 0.5 has the best overall power.
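The definition above can be evaluated by hand for a single pair of communities. The following is a minimal base R sketch, not the package's C implementation: the helper name gunifrac_pair and all numeric values are made up for illustration, and dropping branches that are absent from both communities (where the ratio would be 0/0) is an assumption of this sketch.

```r
# Hypothetical helper (not part of MiSPU): evaluates d^(alpha) for one pair
# of communities from branch lengths b and branch proportions pA, pB.
gunifrac_pair <- function(b, pA, pB, alpha) {
  s <- pA + pB
  keep <- s > 0                      # assumption: skip branches absent from both communities
  w <- b[keep] * s[keep]^alpha       # branch weight b_i * (pA_i + pB_i)^alpha
  sum(w * abs(pA[keep] - pB[keep]) / s[keep]) / sum(w)
}

# Toy branch lengths and branch proportions (illustrative values only)
b  <- c(1.0, 2.0, 0.5, 1.5, 0.7)
pA <- c(0.9, 0.1, 0.0, 1.0, 0.0)
pB <- c(0.2, 0.3, 0.5, 1.0, 0.0)

# Evaluate at the three default values of alpha
d <- sapply(c(0, 0.5, 1), function(a) gunifrac_pair(b, pA, pB, a))
names(d) <- c("d0", "d5", "d1")
d
```

At \alpha = 1 the weight (p^A_{i} + p^B_{i}) cancels the denominator of the ratio, so the sketch reduces to \sum b_i |p^A_{i} - p^B_{i}| / \sum b_i (p^A_{i} + p^B_{i}), the weighted UniFrac form listed under Value.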
Usage
GUniFrac(otu.tab, tree, alpha = c(0, 0.5, 1))
Arguments
otu.tab |
OTU count table with n samples as rows and q OTUs as columns |
tree |
Rooted phylogenetic tree of R class “phylo” |
alpha |
Parameter controlling the weight on abundant lineages; defaults to c(0, 0.5, 1) |
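The expected orientation of otu.tab can be checked before calling the function. The sketch below uses base R only, with made-up counts. The vector tip.labels is a hypothetical stand-in for tree$tip.label, and the requirement that OTU column names match tip labels is an assumption of this sketch, not something this page states.

```r
# Toy count table: 3 samples (rows) by 4 OTUs (columns); counts are made up
otu.tab <- matrix(c(10, 0, 5, 2,
                     3, 7, 0, 1,
                     0, 4, 6, 9),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(paste0("sample", 1:3), paste0("OTU", 1:4)))

# Hypothetical tip labels standing in for tree$tip.label
tip.labels <- c("OTU1", "OTU2", "OTU3", "OTU4", "OTU5")

# Rows are samples (n), columns are OTUs (q)
dim(otu.tab)

# Assumed sanity check: OTUs in the table that have no tip in the tree
unknown <- setdiff(colnames(otu.tab), tip.labels)
length(unknown) == 0
```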
Value
Returns a list containing
d0 |
UniFrac(0) |
d5 |
UniFrac(0.5) |
d1 |
UniFrac(1), weighted UniFrac |
or a list containing
GUniFrac |
The distance matrix for different alpha |
alpha |
The weight |
Note
The time-consuming part is written in C and is faster than the original implementation. The function only accepts a rooted tree.
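Because only rooted trees are accepted, it can help to check the tree first. The sketch below assumes the ape package is installed (it provides the "phylo" class) and uses a random tree for illustration. Rooting on the first tip is an arbitrary choice made here, and a different root can change the resulting distances.

```r
library(ape)

set.seed(1)
tr <- rtree(5)            # random 5-tip tree; rtree() is rooted by default
is.rooted(tr)

un <- unroot(tr)          # drop the root to mimic an unrooted input tree
is.rooted(un)

# Re-root on an arbitrarily chosen outgroup tip (illustrative choice only)
re <- root(un, outgroup = un$tip.label[1], resolve.root = TRUE)
is.rooted(re)
```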
Author(s)
Chong Wu <chongwu@umn.edu>
References
Chen, Jun, et al. (2012). "Associating microbiome composition with environmental covariates using generalized UniFrac distances." Bioinformatics 28(16):2106-2113.
Examples
library(MiSPU)
data(throat.otu.tab)
data(throat.tree)
data(throat.meta)
groups <- throat.meta$SmokingStatus
# Calculate the UniFracs
unifracs <- GUniFrac(throat.otu.tab, throat.tree)
unifracs