GMPR {GUniFrac} | R Documentation
Geometric Mean of Pairwise Ratios (GMPR) Normalization for Zero-inflated Count Data
Description
A robust normalization method for zero-inflated count data such as microbiome sequencing data.
Usage
GMPR(OTUmatrix, min_ct = 2, intersect_no = 4)
Arguments
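OTUmatrix | An OTU count table, where OTUs are arranged in rows and samples in columns.
min_ct | The minimal number of OTU counts. Only those OTU pairs with at least min_ct counts are considered in the ratio calculation. The default is 2.
intersect_no | The minimal number of shared OTUs between samples. Only those sample pairs sharing at least intersect_no OTUs are considered in the geometric mean calculation. The default is 4.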
Details
Normalization is a critical step in microbiome sequencing data analysis because it accounts for variable library sizes. Microbiome data contain a vast number of zeros, which makes traditional RNA-Seq normalization methods unstable. GMPR normalization remedies this problem by switching the two steps of DESeq2 normalization:
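First, calculate r_ij, the median count ratio between samples i and j, taken over the OTUs with nonzero counts in both samples: r_ij = median(c_ki / c_kj), for k in 1:OTU_number with c_ki != 0 and c_kj != 0, where c_ki is the count of the kth OTU in sample i.
Second, calculate the size factor s_i for a given sample i as the geometric mean of its pairwise ratios over the samples j: s_i = geometric_mean(r_ij).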
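The two steps above can be sketched in base R as follows. This is a minimal illustration, not the GUniFrac implementation: the function name gmpr_sketch and the toy matrix are hypothetical, and two details are assumptions made here, namely that an OTU enters a pair only if it has at least min_ct counts in both samples, and that the self pair (r_ii = 1) is included in the geometric mean.

```r
# Sketch of the two GMPR steps (assumed details are flagged below).
# counts: numeric matrix with OTUs in rows and samples in columns.
gmpr_sketch <- function(counts, min_ct = 2, intersect_no = 4) {
  n <- ncol(counts)
  size_factor <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    log_ratios <- numeric(0)
    for (j in seq_len(n)) {
      # Assumption: keep OTUs with at least min_ct counts in both samples.
      shared <- counts[, i] >= min_ct & counts[, j] >= min_ct
      # Skip sample pairs that share fewer than intersect_no OTUs.
      if (sum(shared) >= intersect_no) {
        # Step 1: r_ij, the median count ratio over the shared OTUs.
        r_ij <- median(counts[shared, i] / counts[shared, j])
        log_ratios <- c(log_ratios, log(r_ij))
      }
    }
    # Step 2: s_i, the geometric mean of the pairwise ratios.
    # Assumption: the self pair (j == i, ratio 1) is part of the mean.
    if (length(log_ratios) > 0) size_factor[i] <- exp(mean(log_ratios))
  }
  size_factor
}

# Hypothetical toy table: sample s2 is sequenced twice as deep as s1,
# and the last OTU is absent (zero) in s1.
toy <- cbind(s1 = c(2, 4, 6, 8, 10, 0),
             s2 = c(4, 8, 12, 16, 20, 7))
sf <- gmpr_sketch(toy)
```

In this toy table the zero in s1 drops out of the pairwise ratio instead of distorting it, so the ratio sf[2] / sf[1] recovers the twofold difference in sequencing depth.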
Value
A vector of GMPR size factors, one for each sample.
Author(s)
Jun Chen and Lujun Zhang
References
Li Chen, James Reeve, Lujun Zhang, Shenbing Huang, and Jun Chen. 2018. GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data. PeerJ, 6, e4600.
Examples
library(GUniFrac)
data(throat.otu.tab)
# throat.otu.tab has samples in rows, so transpose it to put OTUs in rows
size.factor <- GMPR(t(throat.otu.tab))