tarq {iGasso} | R Documentation |
An Accurate Normalization Method for High Throughput Sequencing Data
Description
Estimates scaling factors using the trimmed average of ratios of quantiles (TARQ) method
Usage
tarq(X, tau = 0.3)
Arguments
X |
a matrix of raw counts, with rows for taxa (genes, transcripts) and columns for samples |
tau |
a numerical value in (0, 0.5). The upper |
Details
Estimating scaling factors for NGS read-count data is challenging. TARQ is a quantile-based method for estimating them. It first orders the raw counts within each sample and constructs a reference sample from these ordered counts. To compute the scaling factor for a sample, the ratios of its quantiles to those of the reference sample are formed. Zero ratios are removed, extreme ratios (too large or too small) are trimmed, and the remaining ratios are averaged.
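The steps above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the iGasso implementation: the function name tarq_sketch is hypothetical, the reference sample is assumed to be the across-sample mean of the ordered counts, and tau is assumed to be the fraction trimmed from each end of the sorted ratios. The package may construct the reference and apply the trimming differently.

```r
# Hypothetical sketch of the TARQ steps; NOT the iGasso source code.
tarq_sketch <- function(X, tau = 0.3) {
  stopifnot(is.matrix(X), nrow(X) > 1, tau > 0, tau < 0.5)

  # Step 1: order the raw counts within each sample (column).
  sorted <- apply(X, 2, sort)

  # Step 2 (assumption): the reference sample is the mean of the
  # ordered counts across samples, rank by rank.
  ref <- rowMeans(sorted)

  # Steps 3-5, one sample at a time.
  apply(sorted, 2, function(q) {
    # Ratios of the sample's quantiles to the reference quantiles;
    # ranks where the reference is zero are skipped to avoid 0/0.
    keep <- ref > 0
    r <- q[keep] / ref[keep]

    # Remove zero ratios.
    r <- r[r > 0]

    # (Assumption) trim a fraction tau from each end, then average.
    mean(r, trim = tau)
  })
}

# Toy data: sample "b" has exactly twice the counts of sample "a".
x <- c(0, 1, 2, 4, 8, 16, 32, 64, 128, 256)
X <- cbind(a = x, b = 2 * x)
s <- tarq_sketch(X, tau = 0.3)
```

On this toy matrix the sketch gives sample b a scaling factor twice that of sample a, which is the behaviour one would expect from a ratio-based method.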
Value
a vector of scaling factors, one per sample (column of X). Normalized counts can be obtained by sweep(X, 2, scale.factors, FUN="/"), where scale.factors is the vector returned by tarq
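As a small illustration of the sweep call, the following uses a toy count matrix and made-up scaling factors (both hypothetical, chosen only so the arithmetic is easy to follow; in practice the factors would come from tarq).

```r
# Toy counts: 3 taxa (rows) by 2 samples (columns).
X <- matrix(c(10, 20, 30,
              20, 40, 60),
            nrow = 3,
            dimnames = list(c("t1", "t2", "t3"), c("s1", "s2")))

# Made-up scaling factors, one per column of X.
scale.factors <- c(1, 2)

# Divide each column of X by its scaling factor.
norm.counts <- sweep(X, 2, scale.factors, FUN = "/")
```

Here the second sample was sequenced at twice the depth of the first, so after division the two columns of norm.counts are identical.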
Author(s)
Kai Wang <kai-wang@uiowa.edu>
References
Wang, K. (2018) An Accurate Normalization Method for Next-Generation Sequencing Data. Submitted.
Examples
#data(throat.otu.tab)
#data(throat.meta)
#otu.tab = t(throat.otu.tab)
#tarq(otu.tab, 0.3)
##### Use TARQ with DESeq2
#library(DESeq2)
#dds <- DESeqDataSetFromMatrix(countData = otu.tab,
#                              colData = throat.meta,
#                              design = ~ SmokingStatus)
#sizeFactors(dds) <- tarq(otu.tab, 0.3)
#dds <- DESeq(dds)
#results(dds)
#
##### Use TARQ with edgeR
#library(edgeR)
#cs <- colSums(otu.tab)
#scale.factors <- tarq(otu.tab, 0.3)
#tmp <- scale.factors/cs
#norm.factors <- tmp/exp(mean(log(tmp)))
#dgList <- DGEList(counts = otu.tab, genes=rownames(otu.tab), norm.factors = norm.factors)
#designMat <- model.matrix(~ throat.meta$SmokingStatus)
#dgList <- estimateGLMCommonDisp(dgList, design=designMat)
#fit <- glmFit(dgList, designMat)
#glmLRT(fit, coef=2)