generate.lm {coxed}    R Documentation

Generate simulated durations using a baseline survivor function and proportional hazards


This function is called by sim.survdata and is not intended to be used by itself.


generate.lm(baseline, X = NULL, N = 1000, type = "none",
  beta = NULL, xvars = 3, mu = 0, sd = 1, censor = 0.1)



baseline The baseline hazard, cumulative hazard, survival, failure PDF, and failure CDF as output by
X A user-specified data frame containing the covariates that condition duration. If NULL, covariates are generated from normal distributions with means given by the mu argument and standard deviations given by the sd argument
N Number of observations in each generated data frame
type If "none" (the default), data are generated with no time-varying covariates or coefficients. If "tvc", data are generated with time-varying covariates, and if "tvbeta", data are generated with time-varying coefficients (see details)
beta A user-specified vector containing the coefficients for the linear part of the duration model. If NULL, coefficients are generated from normal distributions with means of 0 and standard deviations of 0.1
xvars The number of covariates to generate. Ignored if X is not NULL
mu If scalar, all covariates are generated to have means equal to this scalar. If a vector, it specifies the mean of each covariate separately, and it must be equal in length to xvars. Ignored if X is not NULL
sd If scalar, all covariates are generated to have standard deviations equal to this scalar. If a vector, it specifies the standard deviation of each covariate separately, and it must be equal in length to xvars. Ignored if X is not NULL
censor The proportion of observations to designate as being right-censored
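
As an illustration of the defaults described above, the following base R sketch generates covariates and coefficients the way the argument descriptions suggest. It is not the package's internal code. The variable names failed and n.censored are hypothetical, and selecting the censored rows uniformly at random is an assumption; the documentation states only the proportion to censor.

```r
set.seed(1)

N      <- 1000
xvars  <- 3
mu     <- c(0, 1, -1)  # one mean per covariate (a scalar would be recycled)
sd     <- 1            # scalar: the same standard deviation for every covariate
censor <- 0.1

# Recycle scalar arguments so there is one value per covariate
mu <- rep_len(mu, xvars)
sd <- rep_len(sd, xvars)

# Column j is drawn from a normal distribution with mean mu[j] and sd sd[j]
X <- sapply(seq_len(xvars), function(j) rnorm(N, mean = mu[j], sd = sd[j]))
colnames(X) <- paste0("X", seq_len(xvars))

# Coefficients drawn from normal distributions with mean 0 and sd 0.1
beta <- rnorm(xvars, mean = 0, sd = 0.1)

# ASSUMPTION: censored rows are chosen uniformly at random
n.censored <- round(censor * N)
failed <- rep(TRUE, N)
failed[sample(N, n.censored)] <- FALSE
```

With a user-supplied X or beta, the corresponding random draw would simply be skipped.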


If type="none" then the function generates an idiosyncratic survival function for each observation via proportional hazards: first the linear predictor is calculated from the X variables and beta coefficients, then the baseline survivor function is raised to the power of the exponentiated linear predictor. For each observation's survival function, a duration is drawn by drawing a single random number from U[0,1] and finding the time point at which the survival function first decreases past this value. See Harden and Kropko (2018) for a more detailed description of this algorithm.
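
The steps in the preceding paragraph can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the baseline survivor function S0 is a made-up decreasing curve rather than output from the package, Tmax stands in for the maximum time point, and assigning Tmax when the survival function never falls below the random draw is an assumption.

```r
set.seed(42)

N    <- 5
Tmax <- 100

# HYPOTHETICAL baseline survivor function: any curve decreasing from near 1
S0 <- exp(-((1:Tmax) / 30)^1.5)

# Linear predictor from covariates and coefficients
X      <- matrix(rnorm(N * 3), nrow = N)
beta   <- c(0.1, -0.2, 0.05)
XB     <- as.vector(X %*% beta)
exp.XB <- exp(XB)

# Proportional hazards: S_n(t) = S0(t)^exp(XB_n), giving an N x Tmax matrix
survmat <- outer(exp.XB, S0, function(e, s) s^e)

# One uniform draw per observation; the duration is the first time point
# at which that observation's survivor function drops below the draw
u <- runif(N)
y <- sapply(seq_len(N), function(n) {
  below <- which(survmat[n, ] < u[n])
  if (length(below) > 0) below[1] else Tmax  # ASSUMPTION: cap at Tmax
})
```

Each row of survmat here plays the role of the survmat component described in the return value.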

If type="tvc", this function cannot accept user-supplied data for the covariates, because a time-varying covariate is expressed over time frames which themselves convey part of the variation of the durations, and these durations are what the function generates. If user-supplied X data is provided, the function issues a warning and generates random data instead, as if X=NULL. Durations are again drawn using proportional hazards, and are passed to the permalgorithm function in the PermAlgo package to generate the time-varying data structure (Sylvestre and Abrahamowicz 2008).

If type="tvbeta" the first coefficient, whether coefficients are user-supplied or randomly generated, is interacted with the natural log of the time counter from 1 to T (the maximum time point for the baseline functions). Durations are generated via proportional hazards, and coefficients are saved as a matrix to illustrate their dependence on time.
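
A rough base R sketch of the time-varying coefficient matrix described above is shown below. It assumes that "interacted with" means the first coefficient is multiplied by log(t); the names Tmax, beta.mat, and XB.t are hypothetical and chosen for illustration only.

```r
set.seed(7)

Tmax <- 100
beta <- c(0.5, -0.2, 0.1)

# One row per time point, one column per coefficient
beta.mat <- matrix(beta, nrow = Tmax, ncol = length(beta), byrow = TRUE)

# ASSUMPTION: the first coefficient is multiplied by log(t), t = 1, ..., Tmax
beta.mat[, 1] <- beta[1] * log(1:Tmax)

# Time-specific linear predictor: entry [n, t] uses the coefficients at time t
X    <- matrix(rnorm(4 * 3), nrow = 4)
XB.t <- X %*% t(beta.mat)
```

Because log(1) = 0, the first coefficient is zero at the first time point under this assumption and grows in magnitude afterwards, while the remaining coefficients stay constant over time.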


Returns a list with the following components:

data The simulated data frame, including the simulated durations, the censoring variable, and covariates
beta The coefficients, varying over time if type is "tvbeta"
XB The linear predictor for each observation
exp.XB The exponentiated linear predictor for each observation
survmat An (N x T) matrix containing the individual survivor function at time t for the individual represented by row n
tvc A logical value indicating whether or not the data includes time-varying covariates


Jonathan Kropko and Jeffrey J. Harden


Harden, J. J. and Kropko, J. (2018). Simulating Duration Data for the Cox Model. Political Science Research and Methods.

Sylvestre, M.-P. and Abrahamowicz, M. (2008). Comparison of algorithms to generate event times conditional on time-dependent covariates. Statistics in Medicine 27(14): 2618–34.

See Also

sim.survdata, permalgorithm


baseline <-, knots=8, spline=TRUE)
# No time-varying covariates or coefficients
simdata <- generate.lm(baseline, N=1000, xvars=5, mu=0, sd=1, type="none", censor=.1)
# Time-varying covariates
simdata <- generate.lm(baseline, N=1000, xvars=5, mu=0, sd=1, type="tvc", censor=.1)
# Time-varying coefficients
simdata <- generate.lm(baseline, N=1000, xvars=5, mu=0, sd=1, type="tvbeta", censor=.1)

[Package coxed version 0.3.3 Index]