permanovaFL {LDM}

R Documentation

PERMANOVA test of association based on the Freedman-Lane permutation scheme

Description

This function performs a PERMANOVA test that allows adjustment for confounders and accommodates clustered data. It can also be used for testing presence-absence associations based on an infinite number of rarefaction replicates. As in ldm, permanovaFL allows multiple sets of covariates to be tested: the sets are entered sequentially, and the variance explained by each set is the part that remains after the previous sets have been fit. It allows testing of a survival outcome by using the Martingale or deviance residual (from fitting a Cox model to the survival outcome and other covariates) as a covariate in the regression. It accepts multiple distance matrices and provides an omnibus test in such cases. It also allows testing of the mediation effect of the microbiome in the pathway between the exposure(s) and the outcome(s), where the exposure(s) and outcome(s) are specified as the first and second (sets of) covariates.
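The survival-outcome workflow described above starts from residuals of a Cox model. The following is a minimal sketch of that first step only, using the `survival` package. The data frame `surv.meta` and its columns `time`, `status`, and `age` are hypothetical toy data invented for illustration; they are not part of LDM.

```r
library(survival)

# Hypothetical toy data: survival time, event indicator, and one covariate
set.seed(1)
n <- 40
surv.meta <- data.frame(
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, size = 1, prob = 0.7),
  age    = rnorm(n, mean = 50, sd = 10)
)

# Cox model for the survival outcome given the other covariates
cox.fit <- coxph(Surv(time, status) ~ age, data = surv.meta)

# Two kinds of residuals that can stand in for the survival outcome
surv.meta$mart.resid <- residuals(cox.fit, type = "martingale")
surv.meta$dev.resid  <- residuals(cox.fit, type = "deviance")

head(surv.meta[, c("mart.resid", "dev.resid")])
```

One of the two residual columns would then enter the formula as a covariate, and the other could be passed through `other.surv.resid` so that the combination test is reported.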

Usage

permanovaFL(
  formula,
  other.surv.resid = NULL,
  data = .GlobalEnv,
  tree = NULL,
  dist.method = c("bray"),
  dist.list = NULL,
  cluster.id = NULL,
  strata = NULL,
  how = NULL,
  perm.within.type = "free",
  perm.between.type = "none",
  perm.within.ncol = 0,
  perm.within.nrow = 0,
  n.perm.max = 5000,
  n.rej.stop = 100,
  seed = NULL,
  square.dist = TRUE,
  center.dist = TRUE,
  scale.otu.table = c(TRUE),
  binary = c(FALSE),
  n.rarefy = 0,
  test.mediation = FALSE,
  n.cores = 4,
  verbose = TRUE
)

Arguments

`formula`	a symbolic description of the model to be fitted in the form of `data.matrix ~ sets of covariates` or `data.matrix | confounders ~ sets of covariates`. The details of model specification are given in "Details" of `ldm`. Additionally, in `permanovaFL`, the `data.matrix` can be either an OTU table or a distance matrix. If it is an OTU table, the distance matrix will be calculated internally using the OTU table, `tree` (if required), and `dist.method`. If `data.matrix` is a distance matrix (having class `dist` or `matrix`), it can be squared and/or centered by specifying `square.dist` and `center.dist` (described below). Distance matrices are distinguished from OTU tables by checking for symmetry of `as.matrix(data.matrix)`.
`other.surv.resid`	a vector of data, usually the Martingale or deviance residuals from fitting the Cox model to the survival outcome (if it is the outcome of interest) and other covariates.
`data`	an optional data frame, list or environment (or object coercible to a data frame) containing the covariates of interest and confounding covariates. If not found in `data`, the covariates are taken from `environment(formula)`, typically the environment from which `permanovaFL` is called. The default is `.GlobalEnv`.
`tree`	a phylogenetic tree. Only used for calculating a phylogenetic-tree-based distance matrix. Not needed if the calculation of the requested distance does not involve a phylogenetic tree, or if a distance matrix is directly imported through `formula`.
`dist.method`	a vector of methods for calculating the distance measure, partial match to all methods supported by `vegdist` in the `vegan` package (i.e., "manhattan", "euclidean", "canberra", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis") as well as "hellinger" and "wt-unifrac". Not used if a distance matrix is specified in `formula` or `dist.list`. The default is c("bray"). For more details, see the `dist.method` argument in the `ldm` function.
`dist.list`	a list of pre-calculated distance matrices.
`cluster.id`	cluster identifiers. The default value of NULL should be used if the observations are not in clusters (i.e., are independent).
`strata`	a factor variable (or a character variable, which is converted into a factor) that defines strata (groups) within which permutations are constrained. The default is NULL.
`how`	a permutation control list, for users who want to specify their permutation control list using the `how` function from the `permute` R package. The default is NULL.
`perm.within.type`	a character string that takes values "free", "none", "series", or "grid". The default is "free" (for random permutations).
`perm.between.type`	a character string that takes values "free", "none", or "series". The default is "none".
`perm.within.ncol`	a positive integer, only used if perm.within.type="grid". The default is 0. See the documentation for the R package `permute` for further details.
`perm.within.nrow`	a positive integer, only used if perm.within.type="grid". The default is 0. See the documentation for the R package `permute` for further details.
`n.perm.max`	the maximum number of permutations. The default is 5000.
`n.rej.stop`	the minimum number of rejections (i.e., the permutation statistic exceeds the observed statistic) to obtain before stopping. The default is 100.
`seed`	a user-supplied integer seed for the random number generator in the permutation procedure. The default is NULL; with the default value, an integer seed will be generated internally and randomly. In either case, the integer seed will be stored in the output object in case the user wants to reproduce the permutation replicates.
`square.dist`	a logical variable indicating whether to square the distance matrix. The default is TRUE.
`center.dist`	a logical variable indicating whether to center the distance matrix as described by Gower (1966). The default is TRUE.
`scale.otu.table`	a vector of logical variables indicating whether to scale the OTU table in calculating the distance matrices in `dist.method`. For count data, this corresponds to dividing by the library size to give relative abundances. The default is TRUE.
`binary`	a vector of logical values indicating whether to base the calculation of the distance matrices in `dist.method` on presence-absence (binary) data. The default is c(FALSE) (analyzing relative abundance data).
`n.rarefy`	number of rarefactions. The default is 0 (no rarefaction).
`test.mediation`	a logical value indicating whether to perform the mediation analysis. The default is FALSE. If TRUE, the formula takes the specific form `otu.table ~ exposure + outcome` or, more generally, `otu.table or distance matrix | (set of confounders) ~ (set of exposures) + (set of outcomes)`.
`n.cores`	the number of cores to use in parallel computing, i.e., at most how many child processes will be run simultaneously. The default is 4.
`verbose`	a logical value indicating whether to generate verbose output during the permutation process. The default is TRUE.
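The `how` argument takes a permutation control list built with the `permute` package. As a rough illustration of what such an object looks like, the sketch below builds a control that permutes samples freely within clusters while leaving the clusters themselves in place. The design (4 subjects with 3 samples each) and the names `subject`, `ctrl`, and `perm` are hypothetical. How `permanovaFL` combines `how` with `cluster.id` and the `perm.*` arguments is not shown here.

```r
library(permute)

# Hypothetical design: 4 subjects (clusters) with 3 samples each
subject <- factor(rep(1:4, each = 3))

# Permute samples freely within each subject; do not permute the subjects themselves
ctrl <- how(within = Within(type = "free"),
            plots  = Plots(strata = subject, type = "none"))

# Draw one permutation of the sample indices under this control
set.seed(123)
perm <- shuffle(length(subject), control = ctrl)

# Check that every sample stays inside its own subject
all(subject[perm] == subject)
```

An object such as `ctrl` is the kind of value that would be supplied as `how = ctrl`.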

Value

a list consisting of

`F.statistics`	F statistics for testing each set of covariates
`R.squared`	R-squared statistic for each set of covariates
`F.statistics.OR`, `R.squared.OR`	F statistics and R-squared statistic when the last covariate is `other.surv.resid`
`p.permanova`	p-values for testing each set of covariates
`p.permanova.omni`	the omnibus p-values (that combines information from multiple distance matrices) for testing each set of covariates
`med.p.permanova`	p-values for testing mediation
`med.p.permanova.omni`	the omnibus p-values for testing mediation
`p.permanova.OR`, `p.permanova.omni.OR`	p-values and omnibus p-values when using `other.surv.resid` as the last covariate
`med.p.permanova.OR`, `med.p.permanova.omni.OR`	p-values and omnibus p-values for testing mediation when using `other.surv.resid` as the outcome in the mediation analysis
`p.permanova.com`, `p.permanova.omni.com`	p-values and omnibus p-values from the combination test that combines the results from analyzing the Martingale residual and the deviance residual (one specified in the formula and one specified in `other.surv.resid`)
`med.p.permanova.com`, `med.p.permanova.omni.com`	p-values and omnibus p-values from the combination test for the mediation effect
`n.perm.completed`	number of permutations completed
`permanova.stopped`	a logical value indicating whether the stopping criterion has been met by all tests of covariates
`seed`	the seed that is user supplied or internally generated, stored in case the user wants to reproduce the permutation replicates

Author(s)

Yi-Juan Hu <yijuan.hu@emory.edu>, Glen A. Satten <gsatten@emory.edu>

References

Hu YJ and Satten GA (2020). Testing hypotheses about the microbiome using the linear decomposition model (LDM). Bioinformatics, 36(14), 4106-4115.

Hu YJ and Satten GA (2021). A rarefaction-without-resampling extension of PERMANOVA for testing presence-absence associations in the microbiome. bioRxiv, https://doi.org/10.1101/2021.04.06.438671.

Zhu Z, Satten GA, Mitchell C, and Hu YJ (2021). Analyzing matched sets of microbiome data using the LDM and PERMANOVA. Microbiome, 9(133), https://doi.org/10.1186/s40168-021-01034-9.

Hu Y, Li Y, Satten GA, and Hu YJ (2022). Testing microbiome associations with censored survival outcomes at both the community and individual taxon levels. bioRxiv, https://doi.org/10.1101/2022.03.11.483858.

Examples

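library(LDM)  # provides permanovaFL and the throat.otu.tab5 / throat.meta example data

# Test SmokingStatus and PackYears, adjusting for Sex and AntibioticUse
res.perm <- permanovaFL(throat.otu.tab5 | (Sex + AntibioticUse) ~ SmokingStatus + PackYears,
                        data = throat.meta, dist.method = "bray", seed = 82955,
                        n.perm.max = 1000, n.cores = 1, verbose = FALSE)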
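
As described under `formula`, a distance matrix can be supplied in place of the OTU table. The following sketch assumes the `vegan` package is installed and that the example data load via `data()`. It computes a Bray-Curtis matrix on relative abundances outside of `permanovaFL` and then inspects a few components of the result. The object names `otu`, `rel.abund`, `bc.dist`, and `res.dist` are illustrative, and the results are not claimed to match the OTU-table call exactly.

```r
library(LDM)
library(vegan)

data(throat.otu.tab5)
data(throat.meta)

# Relative abundances (each row divided by its library size), then Bray-Curtis
otu <- as.matrix(throat.otu.tab5)
rel.abund <- otu / rowSums(otu)
bc.dist <- as.matrix(vegdist(rel.abund, method = "bray"))

# The symmetric matrix on the left-hand side is treated as a distance matrix;
# square.dist and center.dist keep their defaults (TRUE)
res.dist <- permanovaFL(bc.dist | (Sex + AntibioticUse) ~ SmokingStatus + PackYears,
                        data = throat.meta, seed = 82955,
                        n.perm.max = 1000, n.cores = 1, verbose = FALSE)

res.dist$p.permanova        # p-values for each set of covariates
res.dist$n.perm.completed   # number of permutations completed
res.dist$seed               # seed stored for reproducibility
```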

[Package LDM version 6.0.1 Index]