ssizeRNA_vary {ssizeRNA} | R Documentation |
Sample Size Calculations for Two-Sample RNA-seq Experiments with Differing Mean and Dispersion Among Genes
Description
This function calculates appropriate sample sizes for two-sample RNA-seq experiments for a desired power in which mean and dispersion vary among genes. Sample size calculations are performed at controlled false discovery rates, user-specified proportions of non-differentially expressed genes, mean counts in control group, dispersion, and fold change. A plot of power versus sample size is generated.
Usage
ssizeRNA_vary(nGenes = 10000, pi0 = 0.8, m = 200, mu, disp, fc,
up = 0.5, replace = TRUE, fdr = 0.05, power = 0.8, maxN = 35,
side = "two-sided", cex.title = 1.15, cex.legend = 1)
Arguments
nGenes |
total number of genes, the default value is |
pi0 |
proportion of non-differentially expressed genes,
the default value is |
m |
pseudo sample size for generated data. |
mu |
a vector (or scalar) of mean counts in control group from which to simulate. |
disp |
a vector (or scalar) of dispersion parameter from which to simulate. |
fc |
a vector (or scalar, or a function that takes an integer n and generates a vector of length n) of fold change for differentially expressed (DE) genes. |
up |
proportion of up-regulated genes among all DE genes,
the default value is |
replace |
sample with or without replacement from given parameters. See Details for more information. |
fdr |
the false discovery rate to be controlled. |
power |
the desired power to be achieved. |
maxN |
the maximum sample size used for power calculations. |
side |
options are "two-sided", "upper", or "lower". |
cex.title |
controls size of chart titles. |
cex.legend |
controls size of chart legend. |
Details
If a vector is input for pi0
, sample size calculations
are performed for each proportion.
If the total number of genes is larger than length of mu
or
disp
, replace
always equals TRUE
.
Value
ssize |
sample sizes (for each treatment) at which desired power is first reached. |
power |
power calculations with corresponding sample sizes. |
crit.vals |
critical value calculations with corresponding sample sizes. |
Author(s)
Ran Bi biranpier@gmail.com, Peng Liu pliu@iastate.edu
References
Liu, P. and Hwang, J. T. G. (2007) Quick calculation for sample size while controlling false discovery rate with application to microarray analysis. Bioinformatics 23(6): 739-746.
Orr, M. and Liu, P. (2009) Sample size estimation while controlling false discovery rate for microarray experiments using ssize.fdr package. The R Journal, 1, 1, May 2009, 47-53.
Law, C. W., Chen, Y., Shi, W., Smyth, G. K. (2014). Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 15, R29.
See Also
Examples
library(edgeR)
library(Biobase)
data(hammer.eset)
## load hammer dataset (Hammer, P. et al., 2010)
counts <- exprs(hammer.eset)[, phenoData(hammer.eset)$Time == "2 weeks"]
counts <- counts[rowSums(counts) > 0,]
trt <- hammer.eset$protocol[which(hammer.eset$Time == "2 weeks")]
mu <- apply(counts[, trt == "control"], 1, mean)
## average read count in control group for each gene
d <- DGEList(counts)
d <- calcNormFactors(d)
d <- estimateCommonDisp(d)
d <- estimateTagwiseDisp(d)
disp <- d$tagwise.dispersion
## dispersion for each gene
## fixed fold change
fc <- 2
size <- ssizeRNA_vary(mu = mu, disp = disp, fc = fc,
m = 30, maxN = 15, replace = FALSE)
size$ssize ## first sample size to reach desired power
size$power ## calculated power for each sample size
size$crit.vals ## calculated critical value for each sample size
## varying fold change
## fc1 <- function(x){exp(rnorm(x, log(2), 0.5*log(2)))}
## size1 <- ssizeRNA_vary(pi0 = 0.8, mu = mu, disp = disp, fc = fc1,
## m = 30, maxN = 20, replace = FALSE)