simdata {biospear}    R Documentation

Generation of data sets with survival outcome

Description

This function simulates a data set with a survival outcome and a given set of active biomarkers (prognostic and/or interacting with the treatment).

Usage

simdata(n, p, q.main, q.inter, prob.tt, m0, alpha.tt, beta.main,
  beta.inter, b.corr, b.corr.by, wei.shape, recr, fu, timefactor,
  active.main, active.inter)

simdataV(traindata, Nvalid)

Arguments

n

the sample size.

p

the number of biomarkers.

q.main

the number of true prognostic biomarkers.

q.inter

the number of true biomarkers interacting with the treatment.

prob.tt

the treatment assignment probability.

m0

the baseline median survival time.

alpha.tt

the effect of the treatment (in log-scale).

beta.main

the effect of the prognostic biomarkers (in log-scale).

beta.inter

the effect of the biomarkers interacting with the treatment (in log-scale).

b.corr

the correlation between biomarkers within a block (see Details).

b.corr.by

the size of the blocks of correlated biomarkers.

wei.shape

the shape parameter of the Weibull distribution.

recr

the recruitment period duration.

fu

the follow-up period duration.

timefactor

the multiplicative scale factor for times (e.g. 1 = times in years).

active.main

the list of the prognostic biomarkers (optional).

active.inter

the list of the biomarkers interacting with the treatment (optional).

traindata

the training set returned by simdata, used to generate the new validation data set with the same characteristics.

Nvalid

the sample size of the new validation data set.

Details

The simdata function generates p Gaussian unit-variance (\sigma = 1) biomarkers with autoregressive correlation (\sigma_{ij} = b.corr^{|i-j|}) within blocks of b.corr.by biomarkers. The number of active biomarkers and their effect sizes (in log-scale) are specified using q.main and beta.main for the true prognostic biomarkers, and using q.inter and beta.inter for the true treatment-effect modifiers.

A total of n patients are generated and randomly assigned to either the experimental treatment (coded as +0.5, with probability prob.tt) or the control treatment (coded as -0.5). The treatment effect is specified using alpha.tt (in log-scale).

Survival times are generated from a Weibull distribution with shape wei.shape (1 = exponential distribution) and a patient-specific scale that depends on the baseline median survival time m0 and on the patient's biomarker values. Censoring status is generated by assuming independent censoring from a U(fu, fu + recr) distribution, reflecting a trial with recr years of accrual and fu years of follow-up.

Another data set with the same characteristics as the one generated by simdata can be obtained with the simdataV function.
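
The data-generating steps described in Details can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the biospear source code: the function name sim_sketch, its defaults, the use of one common effect size per biomarker group, and the proportional-hazards form of the Weibull model are all assumptions made for the example. The timefactor argument is ignored here.

```r
# Hypothetical sketch of the simulation scheme; NOT the biospear implementation.
sim_sketch <- function(n = 200, p = 20, b.corr = 0.6, b.corr.by = 10,
                       prob.tt = 0.5, m0 = 5, alpha.tt = -0.5,
                       beta.main = -0.5, active.main = 1:2,
                       beta.inter = -0.7, active.inter = 3:4,
                       wei.shape = 1, recr = 4, fu = 2) {
  # AR(1) correlation b.corr^|i-j| inside each block of b.corr.by biomarkers,
  # zero correlation between biomarkers of different blocks.
  idx   <- seq_len(p)
  block <- ceiling(idx / b.corr.by)
  Sigma <- b.corr^abs(outer(idx, idx, "-")) * outer(block, block, "==")

  # Unit-variance Gaussian biomarkers with covariance Sigma (Cholesky factor).
  X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)
  colnames(X) <- sprintf("bm%03d", idx)

  # Treatment coded +0.5 (experimental, probability prob.tt) or -0.5 (control).
  treat <- ifelse(runif(n) < prob.tt, 0.5, -0.5)

  # Linear predictor on the log scale (assumption: one shared effect per group).
  main  <- drop(X[, active.main,  drop = FALSE] %*% rep(beta.main,  length(active.main)))
  inter <- drop(X[, active.inter, drop = FALSE] %*% rep(beta.inter, length(active.inter)))
  lp    <- alpha.tt * treat + main + inter * treat

  # Weibull model (assumed form): S(t) = exp(-lambda * t^shape * exp(lp)),
  # with lambda chosen so that the baseline median (lp = 0) equals m0.
  lambda <- log(2) / m0^wei.shape
  time   <- (-log(runif(n)) / (lambda * exp(lp)))^(1 / wei.shape)

  # Independent censoring times drawn from U(fu, fu + recr).
  cens <- runif(n, fu, fu + recr)

  data.frame(treat  = treat,
             time   = pmin(time, cens),
             status = as.integer(time <= cens),
             X)
}

set.seed(1)
d <- sim_sketch(n = 50)
str(d[, 1:5])
```

With wei.shape = 1 the sketch reduces to exponential survival times, and no observed time can exceed fu + recr because every censoring time lies in that interval.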

Value

A simulated data.frame object.

Author(s)

Nils Ternes, Federico Rotolo, and Stefan Michiels
Maintainer: Nils Ternes <nils.ternes@yahoo.com>

Examples

  set.seed(123456)
  sdata <- simdata(
    n = 500, p = 100, q.main = 5, q.inter = 5,
    prob.tt = 0.5, alpha.tt = -0.5,
    beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
    b.corr = 0.6, b.corr.by = 10,
    m0 = 5, wei.shape = 1, recr = 4, fu = 2,
    timefactor = 1,
    active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))

  newdata <- simdataV(
    traindata = sdata,
    Nvalid = 500)
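
A quick way to check that the validation set mirrors the training set is to compare the dimensions and column names of the two data frames. The sketch below is only a suggestion: same_layout is a hypothetical helper, not part of biospear, and the comparison assumes that "same characteristics" includes an identical column layout, which the documentation does not state explicitly. The biospear calls run only when the package is installed.

```r
# Hypothetical helper (not part of biospear): TRUE when both objects are
# data frames with the same column names in the same order.
same_layout <- function(a, b) {
  is.data.frame(a) && is.data.frame(b) && identical(names(a), names(b))
}

# Guarded so that the script still runs when biospear is not installed.
if (requireNamespace("biospear", quietly = TRUE)) {
  set.seed(123456)
  sdata <- biospear::simdata(
    n = 500, p = 100, q.main = 5, q.inter = 5,
    prob.tt = 0.5, alpha.tt = -0.5,
    beta.main = c(-0.5, -0.2), beta.inter = c(-0.7, -0.4),
    b.corr = 0.6, b.corr.by = 10,
    m0 = 5, wei.shape = 1, recr = 4, fu = 2,
    timefactor = 1,
    active.inter = c("bm003", "bm021", "bm044", "bm049", "bm097"))
  newdata <- biospear::simdataV(traindata = sdata, Nvalid = 500)

  print(dim(sdata))                   # rows and columns of the training set
  print(dim(newdata))                 # rows and columns of the validation set
  print(same_layout(sdata, newdata))  # assumption under test: same columns
}
```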

[Package biospear version 1.0.2 Index]