simdata {biospear} | R Documentation |
Generation of data sets with survival outcome
Description
This function simulates a data set with survival outcome with given active biomarkers (prognostic and/or interacting with the treatment).
Usage
simdata(n, p, q.main, q.inter, prob.tt, m0, alpha.tt, beta.main,
beta.inter, b.corr, b.corr.by, wei.shape, recr, fu, timefactor,
active.main, active.inter)
simdataV(traindata, Nvalid)
Arguments
n | the sample size.
p | the number of biomarkers.
q.main | the number of true prognostic biomarkers.
q.inter | the number of true biomarkers interacting with the treatment.
prob.tt | the treatment assignment probability.
m0 | the baseline median survival time.
alpha.tt | the effect of the treatment (in log-scale).
beta.main | the effect of the prognostic biomarkers (in log-scale).
beta.inter | the effect of the biomarkers interacting with the treatment (in log-scale).
b.corr | the autoregressive correlation parameter for biomarkers within a block.
b.corr.by | the size of the blocks of correlated biomarkers.
wei.shape | the shape parameter of the Weibull distribution.
recr | the recruitment period duration.
fu | the follow-up period duration.
timefactor | the multiplicative scale factor for times (e.g. 1 = times in years).
active.main | the list of the prognostic biomarkers (optional).
active.inter | the list of the biomarkers interacting with the treatment (optional).
traindata | the training set returned by simdata.
Nvalid | the sample size of the new validation data set.
Details
The simdata function generates p Gaussian unit-variance (\sigma = 1) biomarkers with autoregressive correlation (\sigma_ij = b.corr^|i-j|) within blocks of b.corr.by biomarkers. The number of active biomarkers and their effect sizes (in log-scale) can be specified using q.main and beta.main for the true prognostic biomarkers and using q.inter and beta.inter for the true treatment-effect modifiers. A total of n patients are generated and randomly assigned to the experimental treatment (coded as +0.5, with probability prob.tt) or the control treatment (coded as -0.5). The treatment effect is specified using alpha.tt (in log-scale). Survival times are generated from a Weibull distribution with shape wei.shape (i.e. 1 = exponential distribution) and a patient-specific scale depending on the baseline median survival time m0 and the biomarker values of the patient.
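The correlation structure and treatment coding described above can be sketched in base R. This is an illustration only, not the biospear source code: the helper ar_block and the object names Sigma, X and treat are hypothetical, and the sketch assumes that biomarkers in different blocks are independent. The bm001-style column names mirror those used in the Examples section.

```r
# Illustrative sketch only; not the biospear implementation.
set.seed(1)
n <- 200          # sample size
p <- 20           # number of biomarkers
b.corr <- 0.6     # autoregressive correlation parameter
b.corr.by <- 10   # block size
prob.tt <- 0.5    # treatment assignment probability

# AR(1) correlation inside one block: sigma_ij = b.corr^|i-j| (hypothetical helper)
ar_block <- function(size, rho) {
  rho^abs(outer(seq_len(size), seq_len(size), "-"))
}

# Block-diagonal correlation matrix
# (assumption: biomarkers in different blocks are uncorrelated)
Sigma <- matrix(0, p, p)
for (start in seq(1, p, by = b.corr.by)) {
  idx <- start:min(start + b.corr.by - 1, p)
  Sigma[idx, idx] <- ar_block(length(idx), b.corr)
}

# Draw n rows from N(0, Sigma): chol() returns the upper-triangular R
# with t(R) %*% R = Sigma, so each row of Z %*% R has covariance Sigma
X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)
colnames(X) <- sprintf("bm%03d", seq_len(p))

# Treatment coded +0.5 (experimental, probability prob.tt) or -0.5 (control)
treat <- rbinom(n, 1, prob.tt) - 0.5
```

With these settings, adjacent biomarkers in the same block have correlation 0.6, biomarkers two positions apart have 0.36, and so on, while the unit diagonal gives each biomarker variance 1.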
Censoring status is generated by considering independent censoring from a U(fu, fu + recr) distribution, reflecting a trial with recr years of accrual and fu years of follow-up.
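A minimal base R sketch of this survival and censoring mechanism follows. It is an assumption-laden illustration, not the code used by biospear: it assumes a proportional-hazards Weibull model in which the effects act as log hazard ratios, and the names lp, scale0, event_time, cens_time, time and status are hypothetical. The exact parameterization used inside the package may differ.

```r
# Illustrative sketch only; the parameterization inside biospear may differ.
set.seed(2)
n <- 1000
m0 <- 5           # baseline median survival time
wei.shape <- 1    # Weibull shape (1 = exponential)
recr <- 4         # accrual duration
fu <- 2           # follow-up duration

# Hypothetical linear predictor (log hazard ratio per patient);
# all zeros here, i.e. every patient is a baseline patient
lp <- rep(0, n)

# Baseline Weibull scale chosen so that the baseline median equals m0:
# median = scale * log(2)^(1 / shape)
scale0 <- m0 / log(2)^(1 / wei.shape)

# Proportional hazards (assumption): multiplying the hazard by exp(lp)
# divides the Weibull scale by exp(lp / shape)
event_time <- rweibull(n, shape = wei.shape,
                       scale = scale0 * exp(-lp / wei.shape))

# Independent censoring from U(fu, fu + recr)
cens_time <- runif(n, min = fu, max = fu + recr)

# Observed time and status (1 = event observed, 0 = censored)
time <- pmin(event_time, cens_time)
status <- as.integer(event_time <= cens_time)
```

The uniform censoring window reflects that a patient enrolled at a uniformly distributed time during accrual can be observed for between fu and fu + recr time units before the analysis.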
Another data set with the same characteristics as the one generated by simdata can be obtained by using the simdataV function.
Value
A simulated data.frame object.
Author(s)
Nils Ternes, Federico Rotolo, and Stefan Michiels
Maintainer: Nils Ternes nils.ternes@yahoo.com
Examples
set.seed(123456)

# Simulate a training set of 500 patients with 100 biomarkers
sdata <- simdata(
  n = 500, p = 100, q.main = 5, q.inter = 5,
  prob.tt = 0.5, alpha.tt = -0.5,
  beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
  b.corr = 0.6, b.corr.by = 10,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1,
  active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))

# Simulate a validation set of 500 patients with the same characteristics
newdata <- simdataV(
  traindata = sdata,
  Nvalid = 500)