sampDis {chemodiv} | R Documentation |
Calculate sample dissimilarities
Description
Function to calculate dissimilarities between samples. Bray-Curtis dissimilarities and/or Generalized UniFrac dissimilarities are calculated.
Usage
sampDis(sampleData, compDisMat = NULL, type = "BrayCurtis", alpha = 1)
Arguments
sampleData |
Data frame with the relative concentration of each compound (column) in every sample (row). |
compDisMat |
Compound dissimilarity matrix, as calculated by
|
type |
Type of sample dissimilarities to be calculated. This is
Bray-Curtis dissimilarities, type = "BrayCurtis", and/or
Generalized UniFrac dissimilarities, type = "GenUniFrac". |
alpha |
Parameter used in calculations of Generalized UniFrac
dissimilarities. alpha can be set between 0 and 1.
With |
Details
The function calculates a dissimilarity matrix for all the samples
in sampleData, for the given dissimilarity index/indices.
Bray-Curtis dissimilarities are calculated using only
the sampleData. This is the most commonly calculated dissimilarity
index used for phytochemical data (other types of dissimilarities are
easily calculated using the vegdist function in the vegan package).
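As a minimal sketch of what a Bray-Curtis dissimilarity matrix contains, the index can be computed by hand in base R. The toy data frame and the helper name brayCurtisPair below are hypothetical and only for illustration; this is not the package's internal code.

```r
# Hypothetical toy data: rows are samples, columns are compounds,
# values are relative concentrations (not one of the package datasets).
toy <- rbind(s1 = c(cmpA = 0.6, cmpB = 0.4, cmpC = 0.0),
             s2 = c(cmpA = 0.2, cmpB = 0.4, cmpC = 0.4),
             s3 = c(cmpA = 0.0, cmpB = 0.0, cmpC = 1.0))

# Bray-Curtis dissimilarity between two abundance vectors:
# sum of absolute differences divided by the sum of all abundances.
# brayCurtisPair is an illustrative helper, not a chemodiv function.
brayCurtisPair <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Fill a symmetric sample-by-sample dissimilarity matrix
n <- nrow(toy)
bcMat <- matrix(0, n, n, dimnames = list(rownames(toy), rownames(toy)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bcMat[i, j] <- brayCurtisPair(toy[i, ], toy[j, ])
  }
}
bcMat
```

If the vegan package is installed, vegdist(toy, method = "bray") should give the same pairwise values.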
If a compound dissimilarity matrix, compDisMat, is supplied,
Generalized UniFrac dissimilarities can be calculated, which also
use the compound dissimilarity matrix for the sample dissimilarity
calculations. For the calculation of Generalized UniFrac
dissimilarities (Chen et al. 2012), the compound dissimilarity matrix is
transformed into a dendrogram using hierarchical clustering (with the
UPGMA method). UniFrac dissimilarities quantify the
fraction of the total branch length of the dendrogram that leads to
compounds present in either sample, but not both. The (weighted) Generalized
UniFrac dissimilarities implemented here additionally take compound
abundances into account. In this way, both the relative proportions of
compounds and the biosynthetic/structural dissimilarities of the compounds
are accounted for in the calculations of sample dissimilarities, such that
two samples containing more biosynthetically/structurally different
compounds have a higher pairwise dissimilarity than two samples
containing more biosynthetically/structurally similar compounds.
As with Bray-Curtis dissimilarities, Generalized UniFrac dissimilarities
range in value from 0 to 1.
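The steps described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the code sampDis itself runs: the toy matrices and the helper genUniFracPair are hypothetical, the dendrogram is built with hclust(method = "average") (UPGMA), and the pairwise formula follows the Generalized UniFrac definition in Chen et al. (2012), where each branch is weighted by its length and by the summed proportion of the two samples raised to alpha.

```r
# Hypothetical compound dissimilarity matrix: cmpA and cmpB are
# structurally similar, cmpC is very different from both.
toyCompDis <- matrix(c(0.0, 0.2, 0.8,
                       0.2, 0.0, 0.8,
                       0.8, 0.8, 0.0), nrow = 3,
                     dimnames = list(c("cmpA", "cmpB", "cmpC"),
                                     c("cmpA", "cmpB", "cmpC")))

# Hypothetical samples, each containing a single compound
sA <- c(cmpA = 1, cmpB = 0, cmpC = 0)
sB <- c(cmpA = 0, cmpB = 1, cmpC = 0)
sC <- c(cmpA = 0, cmpB = 0, cmpC = 1)

# UPGMA dendrogram of the compounds
hc <- hclust(as.dist(toyCompDis), method = "average")

# Illustrative helper (not a chemodiv function): Generalized UniFrac
# dissimilarity between two named proportion vectors on an hclust tree.
genUniFracPair <- function(pA, pB, hc, alpha = 1) {
  pA <- pA[hc$labels]                      # align with the tree's leaf order
  pB <- pB[hc$labels]
  leaves <- vector("list", nrow(hc$merge)) # leaf indices below each node
  num <- 0
  den <- 0
  for (k in seq_len(nrow(hc$merge))) {
    for (child in hc$merge[k, ]) {
      if (child < 0) {                     # negative entry: a single leaf
        idx <- -child
        len <- hc$height[k]
      } else {                             # positive entry: an earlier merge
        idx <- leaves[[child]]
        len <- hc$height[k] - hc$height[child]
      }
      leaves[[k]] <- c(leaves[[k]], idx)
      a <- sum(pA[idx])                    # proportion of sample A below branch
      b <- sum(pB[idx])                    # proportion of sample B below branch
      if (a + b > 0) {
        w <- len * (a + b)^alpha
        num <- num + w * abs(a - b) / (a + b)
        den <- den + w
      }
    }
  }
  num / den
}

dSimilar   <- genUniFracPair(sA, sB, hc, alpha = 1)  # similar compounds
dDifferent <- genUniFracPair(sA, sC, hc, alpha = 1)  # different compounds
c(dSimilar = dSimilar, dDifferent = dDifferent)
```

In this toy case no compound is shared between any two samples, so the Bray-Curtis dissimilarity is 1 for both pairs, whereas the sketch gives a lower value for the pair whose compounds sit close together in the dendrogram.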
Value
List with sample dissimilarity matrices. A list is always returned, even if only one matrix is calculated.
References
Bray JR, Curtis JT. 1957. An Ordination of the Upland Forest Communities of Southern Wisconsin. Ecological Monographs 27: 325-349.
Chen J, Bittinger K, Charlson ES, et al. 2012. Associating microbiome composition with environmental covariates using generalized UniFrac distances. Bioinformatics 28: 2106-2113.
Lozupone C, Knight R. 2005. UniFrac: a New Phylogenetic Method for Comparing Microbial Communities. Applied and Environmental Microbiology 71: 8228-8235.
Examples
data(minimalSampData)
data(minimalCompDis)
sampDis(minimalSampData)
sampDis(sampleData = minimalSampData, compDisMat = minimalCompDis,
        type = c("BrayCurtis", "GenUniFrac"), alpha = 0.5)
data(alpinaSampData)
data(alpinaCompDis)
sampDis(sampleData = alpinaSampData, compDisMat = alpinaCompDis,
        type = "GenUniFrac")