C.alpha.multinomial {HMP} | R Documentation |
C(\alpha)-Optimal Test for Assessing Multinomial Goodness of Fit Versus Dirichlet-Multinomial Alternative
Description
A function to compute the C(\alpha)-optimal test statistic of Kim and Margolin (1992) for evaluating the goodness of fit of a multinomial distribution (null hypothesis) versus a Dirichlet-multinomial distribution (alternative hypothesis).
Usage
C.alpha.multinomial(data)
Arguments
data |
A matrix of taxonomic counts (columns) for each sample (rows). |
Details
To test whether a set of ranked-abundance distributions (RADs) from microbiome samples is better modeled by a multinomial or a Dirichlet-multinomial (DM) distribution, we test the hypothesis \mathrm{H}_{0}: \theta = 0 versus \mathrm{H}_{A}: \theta \ne 0, where the null hypothesis implies a multinomial distribution and the alternative hypothesis implies a DM distribution.
Kim and Margolin (1992) proposed a C(\alpha)-optimal test statistic given by
T = \sum_{j=1}^{K} \sum_{i=1}^{P} \frac{1}{\sum_{i'=1}^{P} x_{i'j}} \left( x_{ij} - \frac{N_{i} \sum_{i'=1}^{P} x_{i'j}}{N_{\mathrm{g}}} \right)^2
where K is the number of taxa, P is the number of samples, x_{ij} is the count of taxon j (j = 1,\ldots,K) in sample i (i = 1,\ldots,P), N_{i} is the number of reads in sample i, and N_{\mathrm{g}} is the total number of reads across samples.
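As a minimal sketch, the expression for T can be evaluated term by term in base R. The helper name calpha_T and the toy matrix are hypothetical and are not part of HMP; the code follows the formula exactly as printed here, and the packaged function may apply further scaling or adjustments that this sketch does not reproduce.

```r
# Hypothetical helper (not part of HMP): evaluates T as printed in this page.
calpha_T <- function(x) {
  x <- as.matrix(x)
  N_i     <- rowSums(x)   # reads per sample i
  col_tot <- colSums(x)   # total count of taxon j across all samples
  N_g     <- sum(x)       # total reads across samples

  # N_i * (total count of taxon j) / N_g for every sample/taxon pair
  expected <- outer(N_i, col_tot) / N_g

  # Squared deviation of each cell, divided by its taxon total, then summed
  sum(sweep((x - expected)^2, 2, col_tot, "/"))
}

# Toy counts invented for illustration: 3 samples (rows) by 3 taxa (columns)
toy <- matrix(c(10, 20, 30,
                12, 18, 30,
                 8, 22, 30), nrow = 3, byrow = TRUE)
T_toy <- calpha_T(toy)
T_toy
```

For this toy matrix every sample has 60 reads, so each row's expected counts are (10, 20, 30), and the only non-zero deviations are the values of 2 and -2 in the first two taxa of samples 2 and 3.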
As the number of reads increases, the distribution of the T statistic converges to a chi-square distribution with (P-1)(K-1) degrees of freedom when the number of sequence reads is the same in all samples. When the number of reads differs between samples, the distribution becomes a weighted chi-square with modified degrees of freedom (see Kim and Margolin, 1992, for details).
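A short sketch of the equal-depth case described above: it checks whether all samples have the same number of reads and, if so, computes the (P-1)(K-1) degrees of freedom. The toy matrix and variable names are invented for illustration; the unequal-depth adjustment from Kim and Margolin (1992) is not reproduced here.

```r
# Toy counts invented for illustration: samples in rows, taxa in columns
toy <- matrix(c(10, 20, 30,
                12, 18, 30,
                 8, 22, 30), nrow = 3, byrow = TRUE)

depths <- rowSums(toy)                       # reads per sample, N_i
equal_depth <- length(unique(depths)) == 1   # TRUE when all N_i are identical

P <- nrow(toy)   # number of samples
K <- ncol(toy)   # number of taxa

# Degrees of freedom that apply only when read depth is equal across samples
df_equal <- if (equal_depth) (P - 1) * (K - 1) else NA
df_equal
```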
Note: Each taxon in data should be present in at least one sample; a column containing only zeros may result in errors and/or invalid results.
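One way to guard against all-zero columns is to drop them before calling the function. The following is a sketch in base R on an invented count matrix; the object names are hypothetical. Dropping taxa reduces K, so the degrees of freedom of the test change accordingly.

```r
# Toy count matrix invented for illustration; taxon2 is absent from every sample
counts <- matrix(c(5, 0, 3,
                   2, 0, 7,
                   4, 0, 1), nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:3)))

# Keep only taxa observed in at least one sample
keep <- colSums(counts) > 0

# drop = FALSE keeps the result a matrix even if a single taxon remains
filtered <- counts[, keep, drop = FALSE]
filtered
```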
Value
A list containing the C(\alpha)-optimal test statistic and p-value.
References
Kim, B. S., and Margolin, B. H. (1992). Testing Goodness of Fit of a Multinomial Model Against Overdispersed Alternatives. Biometrics 48, 711-719.
Examples
library(HMP)
data(saliva)

# Test multinomial (null) versus Dirichlet-multinomial (alternative) fit
calpha <- C.alpha.multinomial(saliva)
calpha