puissance {SARP.compo} | R Documentation |
Estimate the power and the type-I error of the disjoint-subgraphs method
Description
Estimate the power and the type-I error of the disjoint-graph method to detect a change in compositions between different conditions.
Usage
estimer.puissance( composition, cv.composition,
taille.groupes = 10, masque,
f.p, v.X = 'Condition',
seuil.candidats = ( 5:30 ) / 100,
f.correct = groupes.identiques,
groupes.attendus = composition$Graphes[[ 1 ]]$Connexe,
avec.classique = length( attr( composition, "reference" ) ) > 0,
f.correct.classique = genes.trouves,
genes.attendus,
B = 3000, n.coeurs = 1,
... )
estimer.alpha( composition, cv.composition,
taille.groupes = 10, masque,
f.p, v.X = 'Condition',
seuil.candidats = ( 5:30 ) / 100,
avec.classique = length( attr( composition, "reference" ) ) > 0,
B = 3000, n.coeurs = 1,
... )
Arguments
composition |
A composition model, as obtained by
|
cv.composition |
The expected coefficient of variation of the
quantified amounts. Should be either a single value, that will be
used for all components and all conditions, or a matrix with the
same structure as |
.
taille.groupes |
The sample size for each condition. Unused if
|
masque |
A data.frame that will give the dataset design for a
given experiment. Should contain at least one column containing the
names of the conditions, with values being in the conditions names
in |
f.p |
The function used to analyse the dataset. See
|
v.X |
The name of the column identifying the different conditions
in |
seuil.candidats |
A vector of p-value cut-offs to be tested. All values should be between 0 and 1. |
f.correct |
A function to determine if the result of the analysis is the expected one. Defaults to a function that compares the disjoint sub-graphs of a reference graph and the obtained one. |
groupes.attendus |
The reference graph for the above function. Defaults to the theoretical graph of the model, for the comparison between the first and the second conditions. |
avec.classique |
If requested, the analysis is done with and without multiple testing correction (with Holm's method). The “cut-off p-value” is used as the nominal type I error level for the individual tests. |
f.correct.classique |
A function to determine if the alr-like method finds the correct answer. Defaults to a function that compares the set of significant tests with the set of expected components. |
genes.attendus |
A character vector giving the names of components expected to behave differently than the reference set. |
B |
The number of simulations to be done. |
n.coeurs |
The number of CPU cores to use in computation, with parallelization using forks (does not work on Windows) with the help of the parallel package. |
... |
Additional parameters for helper functions, including |
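Since forking is unavailable on Windows, one possible way to choose a value for n.coeurs portably is sketched below. It relies only on base R and parallel::detectCores(); the variable n.detectes is introduced for illustration, and leaving one core free is an arbitrary choice, not a recommendation of the package.

```r
## Number of cores reported by the system (may be NA on some platforms)
n.detectes <- parallel::detectCores()
if ( is.na( n.detectes ) ) n.detectes <- 1

## Forks do not work on Windows: fall back to a single core there;
## elsewhere, keep one core free (arbitrary choice for this sketch)
n.coeurs <- if ( .Platform$OS.type == "windows" ) 1 else max( 1, n.detectes - 1 )
```

The resulting value can then be passed as the n.coeurs argument.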
Details
Use this function to simulate experiments and explore the properties of the disjoint graph method in a specified experimental context. Simulations are done using a log-normal model, so analysis is always done on the log scale. Coefficients of variation in the original scale hence directly translate into standard deviations in the log-scale.
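As a rough illustration of the link between the two scales: under a log-normal model, a coefficient of variation CV on the original scale corresponds to a log-scale standard deviation of sqrt( log( 1 + CV^2 ) ), which is close to CV itself when CV is small. The sketch below uses base R only; the helper cv.vers.sd.log is hypothetical, and whether the package uses this exact relation or the small-CV approximation is an assumption not confirmed here.

```r
## Hypothetical helper (not part of SARP.compo):
## log-scale standard deviation matching a given CV under a log-normal model
cv.vers.sd.log <- function( cv ) sqrt( log( 1 + cv^2 ) )

cv <- c( 0.10, 0.50 )
sd.log <- cv.vers.sd.log( cv )

## Check by simulation: the empirical CV of log-normal draws
## should be close to the requested 50 %
set.seed( 1 )
x <- rlnorm( 1e5, meanlog = 0, sdlog = sd.log[ 2 ] )
cv.empirique <- sd( x ) / mean( x )
```

For a CV of 10 %, the log-scale standard deviation is almost equal to the CV; for a CV of 50 %, it is noticeably smaller.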
For power analysis, note that any rejection of the null hypothesis “nothing is different between conditions” is counted as a success, even if the result does not reflect the original changes. This is the reason for the additional estimation of the probability of finding the correct result. However, defining what is a correct, or at least acceptable, result may not be straightforward, especially for comparison with other analysis methods.
Note also that fair power comparisons can be done only for the same type I error level. Hence, for instance, power of the corrected alr-like method at p = 0.05 should be compared to the power of the disjoint-graph method at its “optimal” cut-off.
Value
An object of class SARPcompo.simulation, with a plot method. It is a data.frame with the following columns:
Seuil |
The cut-offs used to build the graph |
Disjoint |
The number of simulations that led to disjoint graphs. |
Correct |
The number of simulations that led to the correct graph
(as defined by the |
If avec.classique is TRUE, it has additional columns:
DDCt |
The number of simulations that led to at least one significant test using the alr-like method. |
DDCt.H |
The number of simulations that led to at least one significant test using the alr-like method, after multiple testing correction using Holm's method. |
DDCt.correct |
The number of simulations that detected the
correct components (as defined by the |
DDCt.H.correct |
As above, but after multiple testing correction using Holm's method. |
It also stores some information as attributes, including the total
number of simulations (attribute n.simulations).
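Because the columns hold counts, estimated probabilities are obtained by dividing by the n.simulations attribute. A minimal sketch in base R follows, using a hand-made data.frame that only mimics the documented structure; the object res and all its counts are made up for illustration and are not actual simulation results.

```r
## Mock object mimicking the documented structure (counts are made up)
res <- data.frame( Seuil    = c( 0.05, 0.10, 0.15 ),
                   Disjoint = c( 20, 31, 38 ),
                   Correct  = c( 12, 18, 15 ) )
attr( res, "n.simulations" ) <- 50

## Total number of simulations
B <- attr( res, "n.simulations" )

## Estimated probabilities at each cut-off
p.disjoint <- res$Disjoint / B
p.correct  <- res$Correct  / B

## Exact binomial 95 % confidence interval for the first cut-off
ic <- binom.test( res$Disjoint[ 1 ], B )$conf.int
```

With only a few simulations, such confidence intervals are wide, which is one way to judge whether B is large enough.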
Author(s)
Emmanuel Curis (emmanuel.curis@parisdescartes.fr)
See Also
modele_compo to create a compositional model for two or more conditions.
creer.Mp, which is used internally, for details about analysis functions.
choisir.seuil for a simpler interface to estimate the optimal cut-off.
Examples
## Create a toy example: four components, two conditions
## components 1 and 2 do not change between conditions
## components 3 and 4 are doubled
## component 1 is a reference component
me <- rbind( 'A' = c( 1, 1, 1, 1 ),
             'B' = c( 1, 1, 2, 2 ) )
colnames( me ) <- paste0( "C-", 1:4 )
md <- modele_compo( me, reference = 'C-1' )
## How many simulations?
## 50 is for speed; increase for useful results...
B <- 50
## What is the optimal cut-off for this situation?
## (only a few simulations for speed, should be increased)
## (B = 3000 suggests a cut-off between 0.104 and 0.122)
seuil <- choisir.seuil( 4, B = B )
## What is approximately the type I error
## between conditions A and B using a Student test
## with a CV of around 50 % ?
## (only a few simulations for speed, should be increased)
alpha <- estimer.alpha( md, cv = 0.50, B = B,
                        f.p = student.fpc )
# Plot it: darkgreen = the disjoint graph method
#          orange    = the alr-like method, Holm-corrected
#          salmon    = the alr-like method, uncorrected
plot( alpha )
## What is approximately the power to detect that something changes
## between conditions A and B using a Student test
## with a CV of around 50 % ?
## (only a few simulations for speed, should be increased)
puissance <- estimer.puissance( md, cv = 0.50, B = B,
                                f.p = student.fpc,
                                genes.attendus = c( 'C-3', 'C-4' ) )
# Plot it: darkgreen = the disjoint graph method
#          orange    = the alr-like method, Holm-corrected
#          salmon    = the alr-like method, uncorrected
plot( puissance )
## Do we detect the correct situation in general?
## (that is, exactly two sets: one with C-1 and C-2, the other with
## C-3 and C-4; for the alr-like method, that only C-3 and C-4
## are significant)
# darkgreen = the disjoint graph method
# orange    = the alr-like method, Holm-corrected
# salmon    = the alr-like method, uncorrected
plot( puissance, correct = TRUE )