R: Power of the tail-rank test

tailRankPower {TailRank}

R Documentation

Power of the tail-rank test

Description

Compute the significance level and the power of a tail-rank test.

Usage

tailRankPower(G, N1, N2, psi, phi, conf = 0.95,
              model=c("bb", "betabinom", "binomial"))
tailRankCutoff(G, N1, N2, psi, conf,
               model=c("bb", "betabinom", "binomial"),
               method=c('approx', 'exact'))

Arguments

`G`	An integer; the number of genes being assessed as potential biomarkers. Statistically, the number of hypotheses being tested.
`N1`	An integer; the number of "train" or "healthy" samples used.
`N2`	An integer; the number of "test" or "cancer" samples used.
`psi`	A real number between 0 and 1; the desired specificity of the test.
`phi`	A real number between 0 and 1; the sensitivity that one would like to be able to detect, conditional on the specificity.
`conf`	A real number between 0 and 1; the confidence level of the results. Can be obtained by subtracting the family-wise Type I error from 1.
`model`	A character string that determines whether significance and power are computed based on a binomial or a beta-binomial (bb) model.
`method`	A character string; either "exact" or "approx". The default, "approx", uses a Bonferroni approximation.

Details

A power estimate for the tail-rank test can be obtained as follows. First, let X ~ Binom(N,p) denote a binomial random variable. Under the null hypothesis that cancer is not different from normal, we let p = 1 - \psi be the expected proportion of successes in a test of whether the value exceeds the \psi-th quantile. Now let

\alpha = P(X > x | N, p)

be the probability that one such binomial measurement exceeds x. When we make G independent binomial measurements, we take

conf = P(all G of the X's \le x | N, p).

(In our paper on the tail-rank statistic, we write everything in terms of \gamma = 1 - conf.) Then we have

conf = P(X \le x | N, p)^G = (1 - \alpha)^G.

Using a Bonferroni-like approximation, we can take

conf ~= 1 - \alpha*G.

Solving for \alpha, we find that

\alpha ~= (1-conf)/G.

The function tailRankCutoff computes the cutoff x that ensures, across experiments that each look at G genes in N samples, a confidence level conf (equivalently, a significance level \gamma = 1 - conf) of seeing no false positives.
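
The derivation above can be sketched with base R alone. This is only an illustration under the plain binomial model, not the package's implementation: the parameter values and the names alpha.exact, alpha.approx, x.exact, and x.approx are assumptions chosen for the example, and N here stands for the number of test samples.

```r
# Illustrative values (assumptions, not package defaults)
G    <- 10000   # number of genes (hypotheses)
N    <- 50      # number of samples per binomial measurement
psi  <- 0.99    # desired specificity
conf <- 0.95    # family-wise confidence level
p    <- 1 - psi # per-sample success probability under the null

# Per-gene tail probability: exact solution of conf = (1 - alpha)^G,
# and the Bonferroni-like approximation alpha ~= (1 - conf)/G
alpha.exact  <- 1 - conf^(1/G)
alpha.approx <- (1 - conf) / G

# qbinom(q, N, p) returns the smallest x with P(X <= x) >= q,
# so each cutoff satisfies P(X > x) <= alpha for its alpha
x.exact  <- qbinom(1 - alpha.exact,  N, p)
x.approx <- qbinom(1 - alpha.approx, N, p)

# Family-wise confidence actually achieved by each cutoff
c(exact = pbinom(x.exact, N, p)^G, approx = pbinom(x.approx, N, p)^G)
```

Because (1 - conf)/G never exceeds 1 - conf^(1/G), the approximate cutoff is at least as large as the exact one, so the approximation errs on the conservative side.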

The final point to note is that the quantiles are also defined in terms of q = 1 - \alpha, so there are lots of disfiguring "1's" in the implementation.

Now we set M to be the significance cutoff using the procedure detailed above. A gene with sensitivity \phi gets detected if the observed number of cases above the threshold is greater than or equal to M. The tailRankPower function implements formula (1.3) of our paper on the tail-rank test.
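
As a rough illustration of the detection rule as worded above, the following base R sketch computes P(Y >= M) for Y ~ Binom(N2, \phi) under the plain binomial model. It is an assumption-laden sketch, not a reproduction of formula (1.3) or of tailRankPower: the parameter values are made up for the example, the cutoff M is obtained from the Bonferroni-like approximation with qbinom, and the package may treat the boundary case or the beta-binomial model differently.

```r
# Illustrative values (assumptions for this sketch)
G    <- 10000                    # number of genes
N2   <- 50                       # number of test ("cancer") samples
psi  <- 0.95                     # desired specificity
conf <- 0.90                     # family-wise confidence level
phi  <- seq(0.1, 0.7, by = 0.1)  # candidate true sensitivities

# Significance cutoff M from the Bonferroni-like approximation
alpha <- (1 - conf) / G
M <- qbinom(1 - alpha, N2, 1 - psi)

# Probability that at least M of the N2 samples exceed the threshold
# when each does so independently with probability phi
power <- pbinom(M - 1, N2, phi, lower.tail = FALSE)
names(power) <- phi
power
```

The resulting probabilities increase with \phi, which matches the intuition that markers with higher true sensitivity are easier to detect at a fixed sample size.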

Value

tailRankCutoff returns an integer that is the maximum expected value of the tail rank statistic under the null hypothesis.

tailRankPower returns a real number between 0 and 1 that is the power of the tail-rank test to detect a marker with true sensitivity equal to phi.

Author(s)

Kevin R. Coombes <krc@silicovore.com>

Examples

library(TailRank)

# Cutoffs for a range of gene counts, sample sizes, and confidence levels
psi.0 <- 0.99
confide <- rev(c(0.8, 0.95, 0.99))
nh <- 20                             # number of healthy samples (N1)
ng <- c(100, 1000, 10000, 100000)    # numbers of genes (G)
ns <- c(10, 20, 50, 100, 250, 500)   # numbers of cancer samples (N2)
formal.cut <- array(0, c(length(ns), length(ng), length(confide)))
for (i in seq_along(ng)) {
  for (j in seq_along(ns)) {
    formal.cut[j, i, ] <- tailRankCutoff(ng[i], nh, ns[j], psi.0, confide)
  }
}
dimnames(formal.cut) <- list(ns, ng, confide)
formal.cut

# Power to detect markers of varying sensitivity at several test-set sizes
phi <- seq(0.1, 0.7, by=0.1)
N <- c(10, 20, 50, 100, 250, 500)
pows <- matrix(0, ncol=length(phi), nrow=length(N))
for (ph in seq_along(phi)) {
  pows[, ph] <- tailRankPower(10000, nh, N, 0.95, phi[ph], 0.9)
}
pows <- data.frame(pows)
# Rows are sample sizes; columns are sensitivities in percent
dimnames(pows) <- list(as.character(N), as.character(round(100*phi)))
pows

[Package TailRank version 3.2.2 Index]