ssize.twoSamp {ssize.fdr} | R Documentation |
Sample Size Calculations for Two-Sample Microarray Experiments
Description
Calculates the sample size needed to reach a desired power in two-sample microarray experiments. Calculations are performed for a controlled false discovery rate and for user-specified proportions of non-differentially expressed genes, effect size, and standard deviation. A graph of power versus sample size is also created.
Usage
ssize.twoSamp(delta, sigma, fdr = 0.05, power = 0.8, pi0 = 0.95, maxN = 35,
side = "two-sided", cex.title=1.15, cex.legend=1)
Arguments
delta |
the common difference in mean expressions between the two samples for all genes |
sigma |
the common standard deviation of expressions for all genes |
fdr |
the false discovery rate to be controlled |
power |
the desired power to be achieved |
pi0 |
a vector (or scalar) of proportions of non-differentially expressed genes |
maxN |
the maximum sample size used for power calculations |
side |
the sidedness of the test; options are "two-sided", "upper", or "lower" |
cex.title |
controls size of chart titles |
cex.legend |
controls size of chart legend |
Details
The true difference between mean expressions of the two samples as well as the standard deviations of expressions are assumed identical for all genes. See the function ssize.twoSampVary for sample size calculations with varying differences between sample mean expressions and standard deviations among genes.
If a vector is input for pi0, sample size calculations are performed for each proportion.
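The calculation for a single sample size can be sketched in base R. This is a minimal sketch, assuming the two-sided criterion described in Liu and Hwang (2007), in which the critical value is chosen so that the ratio of the type I error rate to the power equals fdr*(1-pi0)/((1-fdr)*pi0). The helper name power_at_n, the noncentrality scaling delta/(sigma*sqrt(2/n)), the grid step, and the search limit of 20 are assumptions made for illustration; this is not the package's own code.

```r
# Hypothetical helper, not part of ssize.fdr: critical value and power at a
# given per-group sample size n for a two-sided two-sample t-test.
power_at_n <- function(n, delta, sigma, fdr = 0.05, pi0 = 0.95, max_crit = 20) {
  df  <- 2 * n - 2                            # degrees of freedom, n per group
  ncp <- delta / (sigma * sqrt(2 / n))        # assumed two-sample noncentrality
  r   <- fdr * (1 - pi0) / ((1 - fdr) * pi0)  # target ratio: type I error / power
  stopifnot(r < 1)                            # otherwise any critical value works

  type1 <- function(crit) 2 * pt(-crit, df)
  pow <- function(crit) {
    # Far-tail noncentral t probabilities can trigger precision warnings.
    suppressWarnings(
      pt(crit, df, ncp = ncp, lower.tail = FALSE) + pt(-crit, df, ncp = ncp)
    )
  }
  # Same root as type1 / pow = r, written without a division.
  gap <- function(crit) type1(crit) - r * pow(crit)

  # Bracket the first sign change on a coarse grid, then refine with uniroot.
  grid <- seq(0.05, max_crit, by = 0.05)
  idx  <- which(gap(grid) <= 0)[1]
  if (is.na(idx)) {
    # No critical value found up to max_crit: report NA and a power of 0.
    return(c(crit = NA_real_, power = 0))
  }
  lower <- if (idx == 1) 1e-8 else grid[idx - 1]
  crit  <- uniroot(gap, interval = c(lower, grid[idx]), tol = 1e-10)$root
  c(crit = crit, power = pow(crit))
}

# Settings taken from the Examples section: delta = 1, sigma = 0.5.
res <- sapply(c(5, 10, 20), power_at_n, delta = 1, sigma = 0.5)
colnames(res) <- c("n=5", "n=10", "n=20")
print(round(res, 3))
```

A grid search followed by uniroot is used here only because it is easy to check by hand; the Note below indicates that the package itself relies on optimize.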
Value
ssize |
sample sizes (for each treatment) at which desired power is first reached |
power |
power calculations with corresponding sample sizes |
crit.vals |
critical value calculations of two-sample t-test with corresponding sample sizes |
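The ssize component reports the first sample size at which the desired power is reached. A minimal sketch of that lookup follows. The helper name first_n_reaching is hypothetical, and the power curve comes from stats::power.t.test at a fixed 5% significance level as a stand-in; it is not the FDR-controlled power that ssize.twoSamp computes.

```r
# Hypothetical helper: smallest n whose power meets the target, NA if none does.
first_n_reaching <- function(n_grid, power_vals, target = 0.8) {
  hit <- which(power_vals >= target)
  if (length(hit) == 0) NA_integer_ else n_grid[hit[1]]
}

n_grid <- 2:20
# Stand-in power curve: ordinary two-sample t-test, not FDR-controlled.
pw <- sapply(n_grid, function(n) {
  power.t.test(n = n, delta = 1, sd = 0.5, sig.level = 0.05)$power
})

n_star <- first_n_reaching(n_grid, pw, target = 0.8)
print(n_star)
```

When pi0 is a vector, the same lookup would be applied to each proportion's power curve separately.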
Note
Powers calculated to be 0 may be negligibly conservative.
Critical values reported as ‘NA’ are greater than 20.
Running this function with the side option of "lower" may produce multiple warnings. While the function runs, it repeatedly calculates the probability that an observation is less than the negative critical value under a t-distribution with non-centrality parameter delta/sigma (see the Arguments section above) and the appropriate degrees of freedom. When the difference between the critical value and delta/sigma is large, this probability is virtually zero. This happens repeatedly while the function optimize searches for the appropriate critical value for each sample size. In these cases, the function pt returns a value <1e-8 along with a warning of “full precision not achieved”. This has no impact on the accuracy of the resulting sample size calculations.
Author(s)
Megan Orr megan.orr@ndsu.edu, Peng Liu pliu@iastate.edu
References
Liu, Peng and J. T. Gene Hwang. 2007. Quick calculation for sample size while controlling false discovery rate with application to microarray analysis. Bioinformatics 23(6): 739-746.
See Also
ssize.twoSampVary, ssize.oneSamp, ssize.oneSampVary, ssize.F, ssize.Fvary
Examples
## See Figure 1(a) of Liu & Hwang (2007)
library(ssize.fdr)
d <- 1                   ## difference in mean expression to be detected
s <- 0.5                 ## standard deviation
a <- 0.05                ## false discovery rate to be controlled
pwr <- 0.8               ## desired power
p0 <- c(0.5, 0.9, 0.95)  ## proportions of non-differentially expressed genes
N <- 20                  ## maximum sample size for calculations
ts <- ssize.twoSamp(delta = d, sigma = s, fdr = a, power = pwr, pi0 = p0,
                    maxN = N, side = "two-sided")
ts$ssize      ## first sample sizes to reach desired power for each proportion
              ## of non-differentially expressed genes
ts$power      ## calculated power for each sample size
ts$crit.vals  ## calculated critical value for each sample size