R: Performing simulation studies with BuyseTest

powerBuyseTest {BuyseTest}

R Documentation

Performing simulation studies with BuyseTest

Description

Performs a simulation study for several sample sizes. Returns the estimates, their standard deviation, the average estimated standard error, and the rejection rate. Can also be used for power calculation or to approximate the sample size needed to reach a specific power.
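
To make the four reported quantities concrete, here is a small hand-rolled simulation study in base R. It is only a sketch: it uses a difference in means with a t-test instead of a GPC statistic, it does not call powerBuyseTest, and the helper `oneRep` and all settings (effect size, sample size, number of replicates) are illustrative assumptions.

```r
## Illustrative only: a simulation study written by hand for a difference in means.
## `oneRep` is a hypothetical helper, not part of BuyseTest.
oneRep <- function(n, shift = 0.5){
    yC <- stats::rnorm(n)                 ## control group
    yT <- stats::rnorm(n, mean = shift)   ## treatment group
    tt <- stats::t.test(yT, yC)
    c(estimate = unname(tt$estimate[1] - tt$estimate[2]),
      se = unname(tt$stderr),
      reject = tt$p.value < 0.05)
}

set.seed(10)
res <- t(replicate(500, oneRep(n = 50)))  ## one row per simulated dataset

simSummary <- c(mean.estimate  = mean(res[,"estimate"]),      ## average estimate
                sd.estimate    = stats::sd(res[,"estimate"]), ## empirical standard deviation
                mean.se        = mean(res[,"se"]),            ## average estimated standard error
                rejection.rate = mean(res[,"reject"]))        ## proportion of rejected null hypotheses
simSummary
```

When the standard error is well estimated, `sd.estimate` and `mean.se` should be close. Under an alternative, `rejection.rate` estimates the power.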

Usage

powerBuyseTest(
  sim,
  sample.size,
  n.rep = c(1000, 10),
  null = c(netBenefit = 0),
  cpus = 1,
  export.cpus = NULL,
  seed = NULL,
  conf.level = NULL,
  power = NULL,
  max.sample.size = 2000,
  alternative = NULL,
  order.Hprojection = NULL,
  transformation = NULL,
  trace = 1,
  ...
)

Arguments

`sim`	[function] takes two arguments, the sample size in the control group (`n.C`) and the sample size in the treatment group (`n.T`), and generates datasets. The datasets must be data.frame objects or inherit from data.frame.
`sample.size`	[integer vector or matrix, >0] the group-specific sample sizes for which the simulations should be performed. When a vector, the same sample size is used for each group. Alternatively, can be a matrix with two columns, one for each group (respectively T and C).
`n.rep`	[integer, >0] the number of simulations. When specifying the power instead of the sample size, should be a vector of length 2 where the second element indicates the number of simulations used to identify the sample size.
`null`	[numeric vector] For each statistic of interest, the null hypothesis to be tested. The vector should be named with the names of the statistics.
`cpus`	[integer, >0] the number of CPUs to use. Default value is 1.
`export.cpus`	[character vector] name of the variables to export to each cluster.
`seed`	[integer, >0] Random number generator (RNG) state used when starting the simulation study. If `NULL` no state is set.
`conf.level`	[numeric, 0-1] confidence level, i.e. one minus the type 1 error level. Default value read from `BuyseTest.options()`.
`power`	[numeric, 0-1] target power, i.e. one minus the type 2 error level, used to determine the sample size. Only relevant when `sample.size` is not given. See details.
`max.sample.size`	[integer, >0] sample size used to approximate the sample size achieving the requested type 1 and type 2 error (see details). Can have length 2 to indicate the sample size in each group (respectively T and C) when the groups have unequal sample sizes.
`alternative`	[character] the type of alternative hypothesis: `"two.sided"`, `"greater"`, or `"less"`. Default value read from `BuyseTest.options()`.
`order.Hprojection`	[integer 1,2] the order of the H-projection to be used to compute the variance of the net benefit/win ratio. Default value read from `BuyseTest.options()`.
`transformation`	[logical] should the CI be computed on the logit scale / log scale for the net benefit / win ratio and backtransformed. Otherwise they are computed without any transformation. Default value read from `BuyseTest.options()`.
`trace`	[integer] should the execution of the function be traced?
`...`	other arguments (e.g. `scoring.rule`, `method.inference`) to be passed to `initializeArgs`.
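
As an illustration of the contract described for `sim` and of the matrix form of `sample.size`, here is a minimal base-R sketch. The function `simShift`, the effect size, and the sample sizes are hypothetical. The column names of the matrix are added for readability only; the sketch assumes that the column order (T, then C) is what matters, as stated in the argument description.

```r
## Hypothetical simulation function following the contract described for `sim`:
## two arguments (n.C, n.T) and a data.frame as output.
simShift <- function(n.C, n.T){
    data.frame(Y = c(stats::rnorm(n.C), stats::rnorm(n.T, mean = 0.5)),
               group = c(rep(0, n.C), rep(1, n.T)))
}

## unequal allocation: one row per scenario, columns ordered as (T, C)
sample.size <- cbind(T = c(10, 20, 40), C = c(20, 40, 80))

## check the output for the first scenario
d <- simShift(n.C = sample.size[1,"C"], n.T = sample.size[1,"T"])
table(d$group)
```

These objects could then be passed as `sim = simShift` and `sample.size = sample.size`, together with `formula = group ~ cont(Y)` as in the examples.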

Details

Sample size calculation: to approximate the sample size achieving the requested type 1 (\alpha) and type 2 error (\beta), GPC are applied on a large sample (as defined by the argument max.sample.size): N^*=m^*+n^* where m^* is the sample size in the control group and n^* is the sample size in the active group. Then the effect (\delta) and the asymptotic variance of the estimator (\sigma^2) are estimated. The total sample size is then deduced as (two-sided case):

\hat{N} = \hat{\sigma}^2\frac{(u_{1-\alpha/2}+u_{1-\beta})^2}{\hat{\delta}^2}

from which the group specific sample sizes are deduced: \hat{m}=\hat{N}\frac{m^*}{N^*} and \hat{n}=\hat{N}\frac{n^*}{N^*}. Here u_x denotes the x-quantile of the normal distribution.
This approximation can be improved by increasing the sample size (argument max.sample.size) and/or by performing it multiple times, each time on a different dataset, and averaging the estimated sample size per group (second element of argument n.rep).
To evaluate the approximation, a simulation study is then performed with the estimated sample size. It will not exactly match the requested power but should provide a reasonable guess which can be refined with further simulation studies. The larger the sample size (and/or number of CPUs) the more accurate the approximation.
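
The two-sided formula above can be evaluated directly in base R. The sketch below is not the package's internal code: `approxN` is a hypothetical helper, the values of \hat{\delta} and \hat{\sigma}^2 are made up, and \sigma^2 is assumed to be scaled so that the variance of the estimator is approximately \sigma^2/N. Note that u_{1-\beta} is the quantile at the requested power.

```r
## Plug-in sample size formula (two-sided case); the input values are made up.
approxN <- function(delta, sigma2, alpha = 0.05, power = 0.8,
                    ratio = c(C = 0.5, T = 0.5)){
    ## u_{1-alpha/2} and u_{1-beta}, with beta = 1 - power
    N <- sigma2 * (stats::qnorm(1 - alpha/2) + stats::qnorm(power))^2 / delta^2
    ## split the total according to the allocation m*/N* and n*/N*, rounding up
    c(N = N, ceiling(N * ratio))
}

out <- approxN(delta = 0.2, sigma2 = 1.5)
out
```

Halving `delta` multiplies the total sample size by four, which is why the approximation is sensitive to the precision of the estimated effect.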

seed: the seed is used to generate one seed per simulation. These simulation seeds are the same whether one or several CPUs are used.
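
The idea of deriving one seed per simulation from a single seed can be sketched in base R as follows. This is an illustration of the principle only and an assumption about how such a scheme may look, not the package's internal code; `runRep` and the range used to draw the seeds are hypothetical.

```r
## Sketch of the idea only; not the package's internal code.
## A master seed is used to draw one seed per simulation, so that each
## replicate is reproducible regardless of which worker runs it.
masterSeed <- 10
n.rep <- 6
set.seed(masterSeed)
seeds <- sample.int(1e5, size = n.rep)

runRep <- function(i){
    set.seed(seeds[i])       ## replicate-specific RNG state
    mean(stats::rnorm(20))   ## stand-in for one simulated analysis
}

sequential <- sapply(1:n.rep, runRep)          ## as on a single CPU
shuffled <- sapply(n.rep:1, runRep)[n.rep:1]   ## another execution order, as may happen across CPUs
identical(sequential, shuffled)
```

Because each replicate resets the RNG state itself, the order in which replicates are executed does not affect their results.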

Value

An S4 object of class S4BuysePower.

Author(s)

Brice Ozenne

Examples

library(data.table)

#### Using simBuyseTest ####
## save time by not generating TTE outcomes
simBuyseTest2 <- function(...){simBuyseTest(..., argsCont = NULL, argsTTE = NULL)}

## only point estimate
## Not run: 
pBT <- powerBuyseTest(sim = simBuyseTest2, sample.size = c(10, 25, 50, 75, 100), 
                  formula = treatment ~ bin(toxicity), seed = 10, n.rep = 1000,
                  method.inference = "none", keep.pairScore = FALSE, cpus = 5)
summary(pBT)
model.tables(pBT)

## End(Not run)

## point estimate with rejection rate

## Not run: 
powerBuyseTest(sim = simBuyseTest2, sample.size = c(10, 50, 100), 
               formula = treatment ~ bin(toxicity), seed = 10, n.rep = 1000,
               method.inference = "u-statistic", trace = 4)

## End(Not run)

#### Using user defined simulation function ####
## power calculation for Wilcoxon test
## (the shift is added to the outcome Y only, so that group stays coded 0/1)
simFCT <- function(n.C, n.T){
    out <- rbind(cbind(Y=stats::rt(n.C, df = 5), group=0),
                 cbind(Y=stats::rt(n.T, df = 5) + 1, group=1))
    return(data.table::as.data.table(out))
}
simFCT2 <- function(n.C, n.T){
    out <- rbind(cbind(Y=stats::rt(n.C, df = 5), group=0),
                 cbind(Y=stats::rt(n.T, df = 5) + 0.25, group=1))
    return(data.table::as.data.table(out))
}


## Not run: 
powerW <- powerBuyseTest(sim = simFCT, sample.size = c(5,10,20,30,50,100),
                         n.rep = 1000, formula = group ~ cont(Y), cpus = "all")
summary(powerW)

## End(Not run) 

## sample size needed to reach (approximately) a power
## based on summary statistics obtained on a large sample 
## Not run: 
sampleW <- powerBuyseTest(sim = simFCT, power = 0.8, formula = group ~ cont(Y), 
                         n.rep = c(1000,10), max.sample.size = 2000, cpus = 5,
                         seed = 10)
nobs(sampleW)
summary(sampleW) ## not very accurate but gives an order of magnitude

sampleW2 <- powerBuyseTest(sim = simFCT2, power = 0.8, formula = group ~ cont(Y), 
                         n.rep = c(1000,10), max.sample.size = 2000, cpus = 5,
                         seed = 10)
summary(sampleW2) ## more accurate when the sample size needed is not too small

## End(Not run)

[Package BuyseTest version 3.0.4 Index]