sumStats {sumSome}    R Documentation

True Discovery Guarantee for Generic Statistics

Description

This function determines confidence bounds for the number of true discoveries, the true discovery proportion and the false discovery proportion within a set of interest. The bounds are simultaneous over all sets, and remain valid under post-hoc selection.
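The three bounds are linked by the definitions of the proportions: the true discovery proportion is the number of true discoveries divided by the size of the set, and the false discovery proportion is its complement. The following base-R sketch uses hypothetical numbers (s and d are not output of sumStats) only to illustrate that arithmetic.

```r
# Hypothetical values, chosen only for illustration
s <- 5   # size of the set of interest S
d <- 2   # lower confidence bound for the number of true discoveries in S

tdp_lower <- d / s           # lower bound for the true discovery proportion
fdp_upper <- 1 - tdp_lower   # upper bound for the false discovery proportion
```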

Usage

sumStats(G, S = NULL, alternative = "greater", alpha = 0.05,
         truncFrom = NULL, truncTo = NULL, nMax = 50)

Arguments

G

numeric matrix of statistics, where columns correspond to variables, and rows to data transformations (e.g. permutations). The first transformation is the identity.

S

vector of indices for the variables of interest (if not specified, all variables).

alternative

direction of the alternative hypothesis ("greater", "lower" or "two.sided").

alpha

significance level.

truncFrom

truncation parameter: values less extreme than truncFrom are truncated. If NULL, statistics are not truncated.

truncTo

truncation parameter: truncated values are set to truncTo. If NULL, statistics are not truncated.

nMax

maximum number of iterations of the algorithm.
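As an illustration of the layout expected for G, the sketch below builds a matrix with one row per transformation and one column per variable, with the identity in the first row. It assumes one-sample t-statistics and random sign flips as the transformations; the names X, n, m, B and tstat are hypothetical and are not part of the package.

```r
set.seed(1)
n <- 20; m <- 5; B <- 10                       # observations, variables, transformations
X <- matrix(rnorm(n * m), nrow = n, ncol = m)  # simulated data (assumption)

# one-sample t-statistic for each column of Y
tstat <- function(Y) colMeans(Y) / (apply(Y, 2, sd) / sqrt(nrow(Y)))

G <- matrix(NA_real_, nrow = B, ncol = m)
G[1, ] <- tstat(X)                             # first transformation: identity
for (b in 2:B) {
  signs <- sample(c(-1, 1), n, replace = TRUE) # one random sign per observation
  G[b, ] <- tstat(signs * X)                   # signs are recycled down the rows
}
dim(G)
```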

Details

Truncation parameters should be such that truncTo is not more extreme than truncFrom.
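A minimal sketch of the truncation rule as described in the arguments, assuming alternative = "greater", so that "less extreme" means "smaller". This illustrates the rule on a small hypothetical matrix; it is not the package's internal code.

```r
# Hypothetical statistics: 2 transformations (rows), 3 variables (columns)
G <- matrix(c(2.1, 0.3, -1.0,
              0.9, 0.5,  1.4), nrow = 2, byrow = TRUE)
truncFrom <- 0.7
truncTo <- 0

# truncTo must not be more extreme than truncFrom
stopifnot(truncTo <= truncFrom)

# values less extreme than truncFrom are set to truncTo
G_trunc <- ifelse(G < truncFrom, truncTo, G)
G_trunc
```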

The significance level alpha should be in the interval [1/B, 1), where B is the number of data transformations (rows in G).
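The constraint on alpha can be checked before calling the function. The helper below, alpha_in_range, is a hypothetical name introduced only for this sketch.

```r
# Hypothetical helper: TRUE if alpha lies in [1/B, 1)
alpha_in_range <- function(alpha, B) {
  alpha >= 1 / B && alpha < 1
}

ok_example <- alpha_in_range(0.4, B = 10)    # 0.4 lies in [0.1, 1)
ok_default <- alpha_in_range(0.05, B = 10)   # 0.05 is below 1/10
```

In particular, the default alpha = 0.05 requires at least 20 transformations.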

Value

sumStats returns an object of class sumObj, whose contents can be accessed with discoveries, tdp and fdp.

Author(s)

Anna Vesely.

References

Goeman, J. J. and Solari, A. (2011). Multiple testing for exploratory research. Statistical Science, 26(4):584-597.

Hemerik, J. and Goeman, J. J. (2018). False discovery proportion estimation by permutations: confidence for significance analysis of microarrays. Journal of the Royal Statistical Society: Series B, 80(1):137-155.

Vesely, A., Finos, L., and Goeman, J. J. (2020). Permutation-based true discovery guarantee by sum tests. Pre-print arXiv:2102.11759.

See Also

True discovery guarantee using p-values: sumPvals

Access a sumObj object: discoveries, tdp, fdp

Examples

# generate matrix of t-scores for 5 variables and 10 permutations
G <- simData(prop = 0.6, m = 5, B = 10, alpha = 0.4, p = FALSE, seed = 42)
 
# subset of interest (variables 1 and 2)
S <- c(1,2)
 
# create object of class sumObj
res <- sumStats(G, S, alpha = 0.4, truncFrom = 0.7, truncTo = 0)
res
summary(res)

# lower confidence bound for the number of true discoveries in S
discoveries(res)

# lower confidence bound for the true discovery proportion in S
tdp(res)

# upper confidence bound for the false discovery proportion in S
fdp(res)

[Package sumSome version 1.1.0 Index]