dmanova {GUniFrac}    R Documentation

Distance-based Multivariate Analysis of Variance (Analytical P-value Calculation)

Description

Analysis of variance using distance matrices: partitions distance matrices among sources of variation and fits linear models (e.g., factors, polynomial regression) to distance matrices. The p-value is calculated analytically from the pseudo-F statistic, without permutation.

Usage

dmanova(formula, data = NULL, positify = FALSE,
		contr.unordered = "contr.sum", contr.ordered = "contr.poly", 
		returnG = FALSE)

Arguments

formula

model formula. The LHS must be a dissimilarity matrix (either class matrix or class dist), e.g., from vegdist or dist. The RHS defines the independent variables. These can be continuous variables or factors, they can be transformed within the formula, and they can have interactions as in a typical formula.

data

the data frame for the independent variables.

positify

a logical value indicating whether to make the Gower's matrix positive definite using the nearPD function in the Matrix package. This is equivalent to modifying the distance matrix so that it has a Euclidean embedding.

contr.unordered, contr.ordered

contrasts used for unordered and ordered factors, respectively, when building the design matrix (the default in R for unordered factors is dummy, or treatment, contrasts).

returnG

a logical value indicating whether the Gower's matrix should be returned.
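As a rough illustration of what positify refers to, the sketch below Gower-centres a small dissimilarity matrix in base R and then applies nearPD from the Matrix package. The helper gower_center and the toy matrix are hypothetical and exist only for this illustration; whether dmanova performs exactly these steps internally is an assumption.

```r
# Sketch only: Gower-centre a dissimilarity matrix and inspect its eigenvalues.
# `gower_center` is a hypothetical helper, not a GUniFrac function.
gower_center <- function(d) {
  D <- as.matrix(d)
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)  # centring matrix
  -0.5 * J %*% (D^2) %*% J            # Gower's matrix
}

# Toy dissimilarity that violates the triangle inequality (5 > 1 + 1),
# so it has no exact Euclidean embedding.
D <- matrix(c(0, 1, 5,
              1, 0, 1,
              5, 1, 0), nrow = 3, byrow = TRUE)

G <- gower_center(D)
ev <- eigen(G, symmetric = TRUE)$values
min(ev) < 0  # a negative eigenvalue signals the non-Euclidean structure

# Replace G by the nearest positive definite matrix
G_pd <- as.matrix(Matrix::nearPD(G)$mat)
ev_pd <- eigen(G_pd, symmetric = TRUE)$values
min(ev_pd)   # no longer meaningfully negative
```

A negative eigenvalue in the Gower's matrix is the usual sign that a dissimilarity is not Euclidean; after the nearPD step all eigenvalues are non-negative up to numerical tolerance.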

Details

dmanova is a permutation-free method for approximating the p-value from distance-based permutational multivariate analysis of variance (PERMANOVA). PERMANOVA is slow when the sample size is large. In contrast, dmanova provides an analytical solution, which is several orders of magnitude faster for large sample sizes. The covariate of interest should be the last term in formula, and the variables to be adjusted for should be placed before it.
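To make the role of term order concrete, the following base-R sketch computes the standard PERMANOVA-style pseudo-F statistic for the last model term after adjusting for all earlier terms. The helper pseudo_F_last and the simulated data are hypothetical; the sketch does not reproduce the analytical p-value approximation of dmanova, and whether dmanova computes the statistic in exactly this way is an assumption.

```r
# Sketch only: sequential pseudo-F for the last term of a model, in base R.
# `pseudo_F_last` is a hypothetical helper, not part of GUniFrac.
pseudo_F_last <- function(d, rhs, data) {
  D <- as.matrix(d)
  n <- nrow(D)
  J <- diag(n) - matrix(1 / n, n, n)
  G <- -0.5 * J %*% (D^2) %*% J             # Gower's matrix

  X <- model.matrix(rhs, data)              # full design matrix
  term_id <- attr(X, "assign")              # term index of each column
  X0 <- X[, term_id < max(term_id), drop = FALSE]  # intercept + adjusted terms

  # Projection onto the column space of M (rank-aware)
  hat_matrix <- function(M) {
    q <- qr(M)
    Q <- qr.Q(q)[, seq_len(q$rank), drop = FALSE]
    tcrossprod(Q)
  }
  H_full <- hat_matrix(X)
  H_red  <- hat_matrix(X0)

  df_term <- qr(X)$rank - qr(X0)$rank
  df_res  <- n - qr(X)$rank
  ss_term <- sum(diag((H_full - H_red) %*% G))     # variation due to last term
  ss_res  <- sum(diag((diag(n) - H_full) %*% G))   # residual variation
  (ss_term / df_term) / (ss_res / df_res)
}

# Simulated data for illustration only
set.seed(1)
meta <- data.frame(sex = factor(rep(c("F", "M"), 10)),
                   grp = factor(rep(c("a", "b"), each = 10)))
Y <- matrix(rnorm(20 * 5), nrow = 20)

# 'grp' is tested after adjusting for 'sex' because it is the last term
pseudo_F_last(dist(Y), ~ sex + grp, meta)
```

With a Euclidean distance on a single response variable, this statistic reduces to the ordinary sequential F statistic for the last term, which gives a simple sanity check of the sketch.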

Value

Function dmanova returns a list with the following components:

aov.tab

typical AOV table showing sources of variation, degrees of freedom, sums of squares, mean squares, F statistics, partial R^2 and P values.

df

degrees of freedom for the chi-squared distribution.

G

the Gower's matrix, if returnG is TRUE.

call

the call made.
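The components listed above can be pulled out of the returned list by name. The sketch below reuses the throat data shipped with the package and sets returnG = TRUE so that G is included; it assumes the GUniFrac package is installed, and the object name res is arbitrary.

```r
# Sketch only: inspect the components of the list returned by dmanova.
library(GUniFrac)
data(throat.otu.tab)
data(throat.tree)
data(throat.meta)

otu.tab.rff <- Rarefy(throat.otu.tab)$otu.tab.rff
unifracs <- GUniFrac(otu.tab.rff, throat.tree, alpha = c(0, 0.5, 1))$unifracs

res <- dmanova(as.dist(unifracs[, , "d_UW"]) ~ Sex + SmokingStatus,
               data = throat.meta, returnG = TRUE)

res$aov.tab   # AOV table, one row per source of variation
res$df        # degrees of freedom for the chi-squared distribution
dim(res$G)    # Gower's matrix, present because returnG = TRUE
res$call      # the call that produced the result
```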

Author(s)

Jun Chen and Xianyang Zhang

References

Chen, J. & Zhang, X. 2021. D-MANOVA: fast distance-based multivariate analysis of variance for large-scale microbiome association studies. Bioinformatics. https://doi.org/10.1093/bioinformatics/btab498

See Also

adonis3

Examples

## Not run: 
data(throat.otu.tab)
data(throat.tree)
data(throat.meta)

groups <- throat.meta$SmokingStatus

# Rarefaction
otu.tab.rff <- Rarefy(throat.otu.tab)$otu.tab.rff

# Calculate the UniFrac distance
unifracs <- GUniFrac(otu.tab.rff, throat.tree, alpha=c(0, 0.5, 1))$unifracs

# Test the smoking effect based on unweighted UniFrac distance, adjusting for sex
# 'Sex' should be put before 'SmokingStatus'
dmanova(as.dist(unifracs[, , 'd_UW']) ~ Sex + SmokingStatus, data = throat.meta)

## End(Not run)


[Package GUniFrac version 1.8 Index]