This function simulates a data set with a survival outcome and a given set of active biomarkers (prognostic and/or interacting with the treatment).

simdata(n, p, q.main, q.inter, prob.tt, m0, alpha.tt, beta.main, beta.inter, b.corr, b.corr.by, wei.shape, recr, fu, timefactor, active.main, active.inter)

simdataV(traindata, Nvalid)

`n` |
the sample size. |

`p` |
the number of biomarkers. |

`q.main` |
the number of true prognostic biomarkers. |

`q.inter` |
the number of true biomarkers interacting with the treatment. |

`prob.tt` |
the treatment assignment probability. |

`m0` |
the baseline median survival time. |

`alpha.tt` |
the effect of the treatment (in log-scale). |

`beta.main` |
the effect of the prognostic biomarkers (in log-scale). |

`beta.inter` |
the effect of the biomarkers interacting with the treatment (in log-scale). |

`b.corr` |
the correlation between biomarkers within each block (autoregressive parameter). |

`b.corr.by` |
the size of the blocks of correlated biomarkers. |

`wei.shape` |
the shape parameter of the Weibull distribution. |

`recr` |
the recruitment period duration. |

`fu` |
the follow-up period duration. |

`timefactor` |
the multiplicative scale factor for times (e.g. 1 = times in years). |

`active.main` |
the list of the prognostic biomarkers (optional). |

`active.inter` |
the list of the biomarkers interacting with the treatment (optional). |

`traindata` |
the training set returned by `simdata`. |

`Nvalid` |
the sample size of the new validation data set. |

The `simdata` function generates `p` Gaussian unit-variance (*σ* = 1) biomarkers with autoregressive correlation (*σ*_ij = `b.corr`^|i-j|) within blocks of `b.corr.by` biomarkers. The number of active biomarkers and their effect sizes (in log-scale) can be specified using `q.main` and `beta.main` for the true prognostic biomarkers and using `q.inter` and `beta.inter` for the true treatment-effect modifiers. A total of `n` patients are generated and randomly assigned to the experimental treatment (coded as +0.5, with probability `prob.tt`) or the control treatment (coded as -0.5). The treatment effect is specified using `alpha.tt` (in log-scale). Survival times are generated from a Weibull distribution with shape `wei.shape` (1 = exponential distribution) and a patient-specific scale depending on the baseline median survival time `m0` and the patient's biomarker values.
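The generation steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the proportional-hazards form S(t | x) = exp(-log(2) (t / `m0`)^`wei.shape` exp(lp)), the variable names `Sigma`, `X`, `treat`, `lp`, `main.idx`, `inter.idx`, and the choice of active biomarkers are all hypothetical.

```r
set.seed(1)

# Illustrative settings (hypothetical values, smaller than a realistic trial)
n <- 200; p <- 20
b.corr <- 0.6; b.corr.by <- 10
prob.tt <- 0.5; alpha.tt <- -0.5
m0 <- 5; wei.shape <- 1

# AR(1) correlation inside one block: sigma_ij = b.corr^|i - j|
idx <- seq_len(b.corr.by)
block <- b.corr^abs(outer(idx, idx, "-"))

# Block-diagonal correlation matrix: biomarkers in different blocks are independent
Sigma <- kronecker(diag(p / b.corr.by), block)

# Correlated unit-variance Gaussian biomarkers via the Cholesky factor of Sigma
X <- matrix(rnorm(n * p), nrow = n) %*% chol(Sigma)

# Treatment coded +0.5 (experimental, probability prob.tt) or -0.5 (control)
treat <- rbinom(n, size = 1, prob = prob.tt) - 0.5

# Hypothetical active biomarkers and their effects (log hazard ratios)
main.idx  <- c(1, 2);   beta.main  <- c(-0.5, -0.2)
inter.idx <- c(11, 12); beta.inter <- c(-0.7, -0.4)

# Linear predictor: treatment, prognostic, and treatment-by-biomarker terms
lp <- alpha.tt * treat +
  drop(X[, main.idx] %*% beta.main) +
  treat * drop(X[, inter.idx] %*% beta.inter)

# Inverse-transform sampling from the assumed survival function
# S(t | x) = exp(-log(2) * (t / m0)^wei.shape * exp(lp))
time <- m0 * (-log(runif(n)) / (log(2) * exp(lp)))^(1 / wei.shape)
```

Under this assumed parametrization, a patient with a linear predictor of zero has median survival `m0`, and negative coefficients lengthen survival.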
Censoring status is generated by considering independent censoring from a U(`fu`, `fu` + `recr`) distribution, reflecting a trial with `recr` years of accrual and `fu` years of follow-up.
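The uniform censoring window arises because a patient recruited at a uniformly distributed time during accrual has between `fu` and `fu` + `recr` of potential follow-up at the time of analysis. A minimal base-R sketch of this step follows; the placeholder event times (a Weibull with median `m0` and no covariate effects) and the names `event.time`, `cens.time`, and `obs` are assumptions made for illustration only.

```r
set.seed(2)

# Illustrative settings (hypothetical values)
n <- 200
m0 <- 5; wei.shape <- 1
recr <- 4; fu <- 2

# Placeholder event times: Weibull with median m0, no covariate effects
event.time <- rweibull(n, shape = wei.shape,
                       scale = m0 / log(2)^(1 / wei.shape))

# Independent censoring times from U(fu, fu + recr)
cens.time <- runif(n, min = fu, max = fu + recr)

# Observed time is the earlier of the two; status = 1 marks an observed event
obs <- data.frame(
  time   = pmin(event.time, cens.time),
  status = as.integer(event.time <= cens.time)
)
```

With these settings no observed time can exceed `fu` + `recr`, and every censored observation has a time of at least `fu`.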
Another data set with the same characteristics as the one generated by `simdata` can be obtained by using the `simdataV` function.

A simulated `data.frame` object.

Nils Ternes, Federico Rotolo, and Stefan Michiels

Maintainer: Nils Ternes nils.ternes@yahoo.com

set.seed(123456)
sdata <- simdata(
  n = 500, p = 100, q.main = 5, q.inter = 5,
  prob.tt = 0.5, alpha.tt = -0.5,
  beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
  b.corr = 0.6, b.corr.by = 10,
  m0 = 5, wei.shape = 1, recr = 4, fu = 2,
  timefactor = 1,
  active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))

newdata <- simdataV(
  traindata = sdata,
  Nvalid = 500)

