ZIGDM {miLineage} | R Documentation |
Zero-Inflated Generalized Dirichlet Multinomial (ZIGDM) Tests
Description
Unlike the distribution-free QCAT and QCAT+GEE tests, the ZIGDM tests are based on a parametric model (ZIGDM) for multivariate taxon counts. The ZIGDM tests can detect not only differential mean abundance but also differential dispersion level and differential presence-absence frequency in microbial compositions. This function allows users to (a) perform ZIGDM tests on multivariate taxon counts; (b) perform ZIGDM tests on the taxonomic tree to localize the covariate-associated lineages; and (c) assess the overall association of the microbial community with the covariate of interest.
Usage
ZIGDM(OTU, X4freq, X4mean, X4disp, test.type = "Mean", X.index, ZI.LB = 10, Tax = NULL,
min.depth = 0, n.resample = NULL, fdr.alpha = 0.05)
Arguments
OTU |
a numeric matrix of counts in which each row corresponds to a sample and each column corresponds to an OTU or a taxon. Column names are mandatory. No missing values are allowed. |
X4freq |
a numeric matrix of covariates linked to the presence-absence frequency in microbial compositions. Each column pertains to one variable. Set X4freq=NULL if only an intercept term is needed. The samples (rows) in the OTU and X4freq matrices should be identical and in the same order. No missing values are allowed. |
X4mean |
a numeric matrix of covariates linked to the mean abundance in microbial compositions. Each column pertains to one variable. Set X4mean=NULL if only an intercept term is needed. The samples (rows) in the OTU and X4mean matrices should be identical and in the same order. No missing values are allowed. |
X4disp |
a numeric matrix of covariates linked to the dispersion level in microbial compositions. Each column pertains to one variable. Set X4disp=NULL if only an intercept term is needed. The samples (rows) in the OTU and X4disp matrices should be identical and in the same order. No missing values are allowed. |
test.type |
If test.type = "Mean", the function will test for differential mean (Default). If test.type = "Disp", the function will test for differential dispersion. If test.type = "Freq", the function will test for differential presence-absence frequency. |
X.index |
If test.type = "Mean", X.index is a vector indicating the columns in X4mean for the covariate(s) of interest. The remaining columns in X4mean will be treated as confounders in modeling the abundance of the present taxa. If test.type = "Disp", X.index is a vector indicating the columns in X4disp for the covariate(s) of interest. The remaining columns in X4disp will be treated as confounders in modeling the dispersion level of the present taxa. If test.type = "Freq", X.index is a vector indicating the columns in X4freq for the covariate(s) of interest. The remaining columns in X4freq will be treated as confounders in modeling the presence-absence frequency. |
ZI.LB |
lower bound on the number of zero counts for a taxon to be fitted with the zero-inflated model. The counts for a taxon are assumed to be zero-inflated if the number of zero observations for the taxon is greater than ZI.LB (Default is 10). If ZI.LB=NULL, the GDM model (i.e., the non-zero-inflated version of ZIGDM) will be applied. |
Tax |
a matrix defining the taxonomic ranks, in which each row corresponds to an OTU or a taxon and each column corresponds to a rank (starting from the highest taxonomic level, e.g., from kingdom to genus). Row names are mandatory and should be consistent with the column names of the OTU table. Column names should be formatted as "Rank1", "Rank2", etc., representing the taxonomic levels from highest to lowest. |
min.depth |
lower bound of sample read depth. Samples with read depth less than min.depth will be removed before the analysis. |
n.resample |
number of resamplings/permutations. If n.resample=NULL (Default), only the asymptotic test will be performed. |
fdr.alpha |
desired false discovery rate for multiple tests on the lineages. |
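The input requirements above can be checked before calling ZIGDM. The following is a minimal sketch in base R, not part of miLineage: the helper name `preflight_zigdm` and the toy matrices are hypothetical. It assumes that "consistent" row names in Tax means the same names in the same order as the OTU columns (as the reordering step in the Examples suggests), and it counts zeros on the unfiltered matrix, which may differ from what the package does internally.

```r
# Hypothetical helper (not provided by miLineage): validates inputs against the
# documented requirements and previews the ZI.LB and min.depth rules.
preflight_zigdm <- function(OTU, X = NULL, Tax = NULL, ZI.LB = 10, min.depth = 0) {
  stopifnot(is.matrix(OTU), is.numeric(OTU))
  if (is.null(colnames(OTU))) stop("OTU must have column names")
  if (anyNA(OTU)) stop("OTU must not contain missing values")
  if (!is.null(X)) {
    # Covariate matrices need one row per sample, in the same order as OTU
    stopifnot(is.matrix(X), is.numeric(X), nrow(X) == nrow(OTU))
    if (anyNA(X)) stop("covariate matrix must not contain missing values")
  }
  if (!is.null(Tax)) {
    # Assumption: "consistent" means identical names in identical order
    if (!identical(rownames(Tax), colnames(OTU)))
      stop("row names of Tax must match column names of OTU")
    if (!identical(colnames(Tax), paste0("Rank", seq_len(ncol(Tax)))))
      stop("Tax columns must be named Rank1, Rank2, ...")
  }
  depth  <- rowSums(OTU)       # read depth per sample
  n.zero <- colSums(OTU == 0)  # number of zero observations per taxon
  list(
    # samples with read depth below min.depth would be removed
    dropped.samples = which(depth < min.depth),
    # taxa with more than ZI.LB zeros would be treated as zero-inflated
    zero.inflated = if (is.null(ZI.LB)) character(0) else names(n.zero)[n.zero > ZI.LB]
  )
}

# Toy inputs (illustrative values only): 4 samples, 3 taxa
OTU <- matrix(c(0, 0, 0, 5,
                3, 0, 2, 1,
                1, 4, 0, 0), nrow = 4,
              dimnames = list(paste0("s", 1:4), c("otu1", "otu2", "otu3")))
Tax <- matrix(c("k", "k", "k", "p1", "p1", "p2"), nrow = 3,
              dimnames = list(colnames(OTU), c("Rank1", "Rank2")))
out <- preflight_zigdm(OTU, X = cbind(case = c(0, 1, 0, 1)), Tax = Tax,
                       ZI.LB = 2, min.depth = 3)
print(out)
```

With ZI.LB = 2 and min.depth = 3, the sketch flags the taxon with three zeros as zero-inflated and the sample with read depth 2 as one that would be removed.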
Value
If Tax=NULL (Default), a test is performed using all the OTUs/taxa.
If Tax is provided, tests are performed for lineages derived from the taxonomic hierarchy. The output is a list that contains three components:
lineage.pval |
p-values for all lineages. By default (n.resample=NULL), only the asymptotic test will be performed. |
sig.lineage |
a vector of significant lineages |
global.pval |
p-values of the global tests |
Author(s)
Zheng-Zheng Tang
References
Tang, Z.-Z. and Chen, G. (2017). Zero-Inflated Generalized Dirichlet Multinomial Regression Model for Microbiome Compositional Data Analysis. Submitted.
Examples
library(miLineage)
data(data.toy)
OTU.toy = data.toy$OTU.toy
Tax.toy = data.toy$Tax.toy
case = data.toy$covariate.toy
# the OTUs should be consistent between the OTU table and the taxonomy table
OTU.toy.reorder = OTU.toy[,match(rownames(Tax.toy), colnames(OTU.toy))]
# perform ZIGDM test for detecting differential dispersion
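# (arguments are named so that the covariate matrix is clearly passed as X4disp)
ZIGDM(OTU.toy.reorder, X4freq = NULL, X4mean = NULL, X4disp = case,
test.type = "Disp", X.index = 1, ZI.LB = 10, Tax = Tax.toy, fdr.alpha = 0.05)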
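As a further illustration of how test.type and X.index work together, the sketch below builds a covariate matrix with one variable of interest and one confounder. The covariate values, the column names `case` and `age`, and the helper `design_for` are hypothetical and are not part of miLineage; the ZIGDM call is left as a commented template because it needs an OTU matrix whose rows match these samples.

```r
# Hypothetical helper: names the design matrix that X.index refers to
# for each documented test.type.
design_for <- function(test.type) {
  switch(test.type,
         Mean = "X4mean",
         Disp = "X4disp",
         Freq = "X4freq",
         stop("test.type must be \"Mean\", \"Disp\", or \"Freq\""))
}

# Illustrative covariates for 6 samples: a case indicator (of interest)
# and age (to be treated as a confounder).
covar <- cbind(case = c(0, 1, 0, 1, 0, 1),
               age  = c(34, 51, 47, 29, 62, 40))

X.index     <- which(colnames(covar) == "case")  # column(s) under test
confounders <- colnames(covar)[-X.index]         # remaining columns are adjusted for

cat("X.index =", X.index, "refers to columns of", design_for("Mean"), "\n")
cat("confounders:", confounders, "\n")

# Template call, assuming an OTU matrix `OTU` with one row per sample in `covar`:
# ZIGDM(OTU, X4freq = NULL, X4mean = covar, X4disp = NULL,
#       test.type = "Mean", X.index = X.index)
```

Selecting X.index by column name rather than by a hard-coded position keeps the test pointed at the intended covariate if columns are later reordered.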