simdata {PWEXP}    R Documentation

Simulate Survival Data


Description

simdata simulates clinical trial data with time-to-event endpoints.


Usage

simdata(group="Group 1", strata="Strata 1", allocation=1,
    event_lambda=NA, drop_rate=NA, death_lambda=NA, n_rand=NULL,
    rand_rate=NULL, total_sample=NULL, add_column=c('followT'),
    simplify=TRUE, advanced_dist=NULL)



Arguments

group: a character vector of the names of each group (e.g., c('treatment','control')).

strata: a character vector of the names of strata in groups (e.g., c('young','old')).

allocation: the relative ratio of sample size in each subgroup (group*strata). See details. The value will be recycled if the length is less than needed.

event_lambda: the hazard rate of the primary endpoint (event). See details. The value will be recycled if the length is less than needed.

drop_rate: (optional) the drop-out rate (patients/month). This is not a hazard rate. See details. The value will be recycled if the length is less than needed.

death_lambda: (optional) the hazard rate of death. The value will be recycled if the length is less than needed.

n_rand: (required when rand_rate=NULL) a vector of the number of subjects randomized each month; can be non-integer.

rand_rate: (required when n_rand=NULL) the randomization rate (patients/month; can be non-integer).

total_sample: (required when n_rand=NULL) total scheduled sample size.

add_column: request additional columns of the returned data frame.
Valid options are:

  • 'eventT_abs': absolute event time from the beginning of the trial (=eventT+randT)

  • 'dropT_abs': absolute drop-out time from the beginning of the trial (=dropT+randT)

  • 'deathT_abs': absolute death time from the beginning of the trial (=deathT+randT)

  • 'censor': censoring (drop-out or death) indicator

  • 'event': event indicator

  • 'censor_reason': censoring reason ('drop_out','death','never_event'(eventT=Inf))

  • 'followT': follow-up time (true observed time) from randT

  • 'followT_abs': absolute follow-up time from the beginning of the trial (=followT+randT)

simplify: whether to drop unused columns (e.g., the group variable when there is only one group). See details.

advanced_dist: use user-specified distributions for event, drop-out and death. A list containing random generation functions. See details and examples.


Details

See the webpage for a diagram illustrating the relationships among the returned variables.

The total number of subgroups is '# treatment groups' * '# strata'. The strata are crossed with every treatment group. For example, if group = c('trt','placebo') and strata = c('A','B','C'), there are 6 subgroups: trt+A, trt+B, trt+C, placebo+A, placebo+B, placebo+C. The arguments allocation, event_lambda, drop_rate and death_lambda should therefore each have length 6; shorter vectors are recycled. For example, if allocation=c(1,2,3), the proportions of the 6 subgroups are 1:2:3:1:2:3, which gives a 1:1 ratio between groups and a 1:2:3 ratio among the strata within each group.
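The subgroup layout and recycling rule described above can be reproduced in base R. This is a minimal sketch, not the package's internal code: the names subgroups, alloc and group_share are illustrative, and the subgroup order (strata varying fastest within each group) is taken from the example in the text.

```r
group  <- c('trt', 'placebo')
strata <- c('A', 'B', 'C')

# Enumerate subgroups with strata varying fastest within each group,
# i.e. trt+A, trt+B, trt+C, placebo+A, placebo+B, placebo+C
subgroups <- expand.grid(strata = strata, group = group,
                         stringsAsFactors = FALSE)
subgroups <- subgroups[, c('group', 'strata')]

# Recycle a length-3 allocation over the 6 subgroups: 1:2:3:1:2:3
alloc <- rep_len(c(1, 2, 3), nrow(subgroups))
subgroups$proportion <- alloc / sum(alloc)

# Share of the total sample per group (equal shares, i.e. 1:1)
group_share <- tapply(subgroups$proportion, subgroups$group, sum)

print(subgroups)
print(group_share)
```

Passing allocation=c(1,2,3,2,4,6) instead would keep the 1:2:3 split among strata while changing the group ratio to 1:2.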

event_lambda ($\lambda$) is the hazard rate of the event of interest. The density function of the event time is $f(t)=\lambda e^{-\lambda t}$. Similarly, death_lambda is the hazard rate of death.

drop_rate is the probability of drop-out by $t=1$, which means the hazard rate of drop-out is $-\log(1-\text{drop\_rate})$ (equivalently, $\text{drop\_rate}=1-e^{-\text{hazard rate}}$).
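The relationship between drop_rate and the drop-out hazard can be checked in base R. The sketch below does not call simdata; the variable names, seed and sample size are illustrative assumptions.

```r
drop_rate <- 0.01                     # probability of drop-out by t = 1

# Convert the drop-out probability into an exponential hazard rate
drop_hazard <- -log(1 - drop_rate)

# Converting back recovers the original probability
back <- 1 - exp(-drop_hazard)

# Equivalently, the exponential CDF at t = 1 equals drop_rate
cdf_at_1 <- pexp(1, rate = drop_hazard)

# Empirical check on simulated exponential drop-out times
# (seed and sample size are arbitrary choices for illustration)
set.seed(1)
dropT <- rexp(1e5, rate = drop_hazard)
emp <- mean(dropT <= 1)

print(c(hazard = drop_hazard, back = back, cdf = cdf_at_1, empirical = emp))
```

For small values the hazard is close to drop_rate itself, but the two diverge as drop_rate grows, which is why the distinction matters.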

When simplify=TRUE, unused columns are NOT included in the returned data frame (e.g., the group variable when there is only one group).

advanced_dist is used to define non-exponential distributions for event, drop-out or death. It is a list containing at least one of the elements event_dist, drop_dist, death_dist. Each element holds one random generation function per subgroup. For example, advanced_dist=list(event_dist=c(function1, function2), drop_dist=c(function3, function4)). Here function1 and function3 are the event and drop-out generation functions for the first subgroup; function2 and function4 are those for the second. If there is a third subgroup, function1 and function3 will be reused. Each data generation function (functionX) takes a single input argument n (sample size). If any of event_dist, drop_dist, death_dist is missing, the corresponding event_lambda, drop_rate or death_lambda is used to generate an exponential distribution; if that is also missing, the corresponding variable will not be generated.
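A generation function for advanced_dist only needs to accept n and return n times. The sketch below builds two such functions with base R's rweibull and rexp and mimics the reuse rule for a third subgroup. The function names and parameter values are hypothetical, and the recycling step only imitates the behaviour described above rather than reproducing the package internals.

```r
# Hypothetical generators: each takes the sample size n and returns n times
event_weibull <- function(n) rweibull(n, shape = 1.5, scale = 40)
event_exp     <- function(n) rexp(n, rate = 0.02)

# c() applied to functions yields a list with one function per subgroup
event_dist <- c(event_weibull, event_exp)

# With three subgroups the two generators are reused in order: 1, 2, 1
n_subgroups <- 3
recycled <- rep(event_dist, length.out = n_subgroups)

# Draw 5 event times per subgroup (seed chosen arbitrarily)
set.seed(2)
times <- lapply(recycled, function(f) f(5))
str(times)
```

A list built this way would be passed as advanced_dist=list(event_dist=event_dist).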


Value

A data frame containing some of these columns:


subject ID


group: group indicator

strata: stratum indicator

randT: randomization time (from the beginning of the trial)

eventT: event time (from randT)

eventT_abs: event time (from the beginning of the trial)

dropT: drop-out time (from randT)

dropT_abs: drop-out time (from the beginning of the trial)

deathT: death time (from randT)

deathT_abs: death time (from the beginning of the trial)

censor: censoring (drop-out or death) indicator

censor_reason: censoring reason ('drop_out','death','never_event'(followT=Inf))

event: event indicator

followT: follow-up time / observed time (from randT)

followT_abs: follow-up time / observed time (from the beginning of the trial)
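The relationships among these columns can be illustrated on toy data in base R. Only the "_abs = time + randT" identities are stated in this page; taking the follow-up time as the earliest of the event, drop-out and death times, and deriving the indicators from it, are assumptions made for this sketch and may differ from the package's exact rules.

```r
# Toy per-subject times (months); Inf means the outcome never occurs
randT  <- c(0.5, 1.0, 2.0)
eventT <- c(12,  Inf, 30)
dropT  <- c(Inf, 8,   40)
deathT <- c(20,  Inf, 25)

# Stated in this page: absolute time = time from randomization + randT
eventT_abs <- eventT + randT

# Assumption: observed follow-up is the earliest of the competing times
followT     <- pmin(eventT, dropT, deathT)
followT_abs <- followT + randT

# Assumption: the event is observed only when it is the earliest outcome
event  <- as.integer(is.finite(eventT) & eventT == followT)
censor <- 1L - event

print(data.frame(randT, eventT, dropT, deathT, followT, followT_abs, event, censor))
```

In this toy data the first subject has an observed event, the second drops out first, and the third dies before the event.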


Note

event_lambda, drop_rate and death_lambda can be 0, in which case the corresponding time variable is Inf for every subject in that subgroup.


Author(s)

Tianchen Xu

See Also

rpwexp, rpwexp_conditional


Examples

# Two groups with two strata. In the treatment group, there is a treatment
# sensitive stratum and a non-sensitive stratum. In the placebo group, all
# subjects are the same. Treatment:place=1:2. Drop rate=1% only in treatment group.
dat <- simdata(group=c('trt', 'place'), strata = c('sensitive','non-sensitive'),
               allocation = c(1,1,2,2), rand_rate = 20, total_sample = 1000,
               event_lambda = c(0.1, 0.2, 0.01, 0.01),
               drop_rate = c(0.01, 0.01, 0, 0))
# randomization curve
plot(sort(dat$randT), 1:1000, xlab='time', ylab='randomized subjects')
# event time in treatment group
plot(ecdf(dat$eventT[dat$group=='trt' & dat$strata=='sensitive']))
lines(ecdf(dat$eventT[dat$group=='trt' & dat$strata=='non-sensitive']), col='red')

# Two groups. Event follows a piecewise exponential distribution; drop-out follows
# a Weibull; death follows an exponential.
dist_trt <- function(n) rpwexp(n, rate=c(0.01, 0.05, 0.01), breakpoint = c(30,60))
dist_placebo <- function(n) rpwexp(n, rate=c(0.01, 0.005), breakpoint = c(50))
# The Weibull shape and scale used for drop_dist are assumed, illustrative values.
dat <- simdata(group = c('trt','placebo'), n_rand = c(rep(10,50),rep(20,10)),
               death_lambda = 0.01,
               advanced_dist = list(event_dist=c(dist_trt, dist_placebo),
                                    drop_dist=function(n) rweibull(n, shape=3, scale=80)))
# randomization curve
plot(sort(dat$randT), 1:700, xlab='time', ylab='randomized subjects')
# event time in both groups
plot(ecdf(dat$eventT[dat$group=='trt']), xlim=c(0,100))
lines(ecdf(dat$eventT[dat$group=='placebo']), col='red')
# drop-out time
plot(ecdf(dat$dropT), xlim=c(0,100))

# mixture cure distribution: 20% of the subjects are cured and will not have events
dat <- simdata(strata=c('cure','non-cure'), allocation=c(20,80),
        event_lambda=c(0, 0.38), n_rand = rep(20,30),
        add_column = c('eventT_abs', 'censor', 'event',
                       'censor_reason', 'followT', 'followT_abs'))

[Package PWEXP version 0.5.0 Index]