surv.simulate {BFI} | R Documentation |
Generate survival data with predefined censoring rates for proportional hazards models
Description
surv.simulate simulates one or multiple (right-censored) survival datasets for proportional hazards models by simultaneously incorporating a baseline hazard function from one of three survival distributions (exponential, Weibull and Gompertz), a random censoring time generated from a uniform distribution with a known or unknown upper limit, and a set of baseline covariates.
When the upper limit of the uniform censoring time distribution is unknown, surv.simulate can be used separately to obtain the upper limit for a predefined censoring rate.
Usage
surv.simulate(L = 1, Z, beta, a, b, u1 = 0, u2, cen_rate,
gen_data_from = c("exp", "weibul", "gomp"),
only_u2 = FALSE, n.rep = 100, Trace = FALSE)
Arguments
L |
the number of datasets to be generated. Default is |
Z |
a list of |
beta |
the vector of the (true) coefficients values, with a length of |
a |
scale parameter, which should be non-negative. See ‘Details’ for the form of the cumulative hazard that can be used. |
b |
shape/location parameter, which should be non-negative. It is not used when |
u1 |
a known non-negative lower limit of the uniform distribution for generating random censoring time. Default is |
u2 |
a non-negative upper limit of the uniform random censoring time distribution. The upper limit can be unknown ( |
cen_rate |
a value representing the proportion of observations in the simulated survival data that are censored. The range of this argument is from 0 to 1. When the upper limit is known, |
gen_data_from |
a description of the distribution from which the time to event is generated. This is a character string and can be |
only_u2 |
logical flag for calculating only the upper limit of the uniform censoring time distribution. If |
n.rep |
a scalar specifying the number of iterations. This argument is exclusively used in the case of the |
Trace |
logical flag indicating whether the output of the desired |
Details
The surv.simulate function generates L simulated right-censored survival datasets from exponential, Weibull, or Gompertz distributions, incorporating the covariates, Z, distributed according to a multivariate normal distribution, with censoring time generated from a uniform distribution Uniform(u1, u2), where u1 is known but u2 can be either known or unknown.
surv.simulate() can also be used to calculate the unknown upper limit of the uniform distribution, u2, with a predefined censoring rate. To do this, set u2 = NULL and only_u2 = TRUE. In this case, the datasets are not generated; only u2 is.
surv.simulate() uses a root-finding algorithm to select the censoring parameter that achieves predefined censoring rates in the simulated survival data.
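The idea of choosing u2 by root finding can be illustrated with a minimal base-R sketch. This is not the package's internal algorithm: the helper expected_cen_rate(), the simulated covariates and all parameter values are hypothetical and chosen only for illustration. For fixed event times, the probability that a Uniform(u1, u2) censoring time falls below an event time t is (t - u1) / (u2 - u1), clipped to [0, 1]; averaging this over the sample gives a continuous function of u2 that uniroot() can solve against the target rate.

```r
set.seed(1)

# Hypothetical setup (not taken from the package): exponential event times
n    <- 2000
Z    <- matrix(rnorm(n * 2), nrow = n)
beta <- c(0.5, -0.5)
a    <- 0.5
T_event <- -log(runif(n)) * exp(-drop(Z %*% beta)) / a

# Expected censoring proportion when C ~ Uniform(u1, u2):
# P(C < t) = (t - u1) / (u2 - u1), clipped to [0, 1], averaged over the sample
expected_cen_rate <- function(u2, times, u1 = 0) {
  mean(pmin(pmax((times - u1) / (u2 - u1), 0), 1))
}

u1     <- 0
target <- 0.4   # desired censoring rate

# Solve expected_cen_rate(u2) = target for u2 on a wide bracket
fit <- uniroot(function(u2) expected_cen_rate(u2, T_event, u1) - target,
               lower = u1 + 1e-6, upper = 1e6, tol = 1e-8)
u2_hat <- fit$root

# Check: draw censoring times with the solved u2 and compare with the target
C <- runif(n, min = u1, max = u2_hat)
empirical_rate <- mean(C < T_event)
```

The empirical rate fluctuates around the target because the censoring times are random; the expected rate matches it up to the solver tolerance.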
When gen_data_from = "exp": the cumulative baseline hazard function is \Lambda_0(t) = a t, and the event time for the \ell^{\text{th}} dataset, T_\ell, is computed as T_\ell = -\log(u) \exp(-Z_\ell \boldsymbol{\beta}) / a, where u follows a standard uniform distribution;
When gen_data_from = "weibul": the cumulative baseline hazard function is \Lambda_0(t) = a t^b, and the event time is computed as T_\ell = (-\log(u) \exp(-Z_\ell \boldsymbol{\beta}) / a)^{1/b}, where u follows a standard uniform distribution;
When gen_data_from = "gomp": the cumulative baseline hazard function is \Lambda_0(t) = a (\exp(b t) - 1) / b, and the event time is computed as T_\ell = \log(1 - \log(u) \exp(-Z_\ell \boldsymbol{\beta}) b / a) / b, where u follows a standard uniform distribution;
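The three event-time formulas are consistent with inverse transform sampling under the usual proportional hazards survival function S(t | Z) = \exp\{-\Lambda_0(t) \exp(Z \boldsymbol{\beta})\}. That form is an assumption here, since this page does not state it explicitly. Setting S(T_\ell | Z_\ell) = u and solving for T_\ell gives, as a short derivation:

```latex
\begin{align*}
\exp\{-\Lambda_0(T_\ell)\exp(Z_\ell\boldsymbol{\beta})\} = u
  &\;\Longrightarrow\;
  \Lambda_0(T_\ell) = -\log(u)\exp(-Z_\ell\boldsymbol{\beta}), \\[4pt]
\text{exp:}\quad a\,T_\ell = -\log(u)\exp(-Z_\ell\boldsymbol{\beta})
  &\;\Longrightarrow\;
  T_\ell = -\log(u)\exp(-Z_\ell\boldsymbol{\beta})/a, \\[4pt]
\text{weibul:}\quad a\,T_\ell^{\,b} = -\log(u)\exp(-Z_\ell\boldsymbol{\beta})
  &\;\Longrightarrow\;
  T_\ell = \bigl(-\log(u)\exp(-Z_\ell\boldsymbol{\beta})/a\bigr)^{1/b}, \\[4pt]
\text{gomp:}\quad a\,\frac{\exp(b\,T_\ell)-1}{b} = -\log(u)\exp(-Z_\ell\boldsymbol{\beta})
  &\;\Longrightarrow\;
  T_\ell = \log\bigl(1-\log(u)\exp(-Z_\ell\boldsymbol{\beta})\,b/a\bigr)/b.
\end{align*}
```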
Finally, the survival time is obtained by \tilde{T}_\ell = \min\{T_\ell, C_\ell\}, where C_\ell is the censoring time.
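The generation steps described in this section can be sketched in base R as follows. This is an illustrative sketch, not the package source: the covariates, the parameter values and the variable names (base, T_exp, T_weib, T_gomp, status) are all hypothetical.

```r
set.seed(42)

# Hypothetical inputs, for illustration only
n    <- 5
Z    <- matrix(rnorm(n * 2), nrow = n)   # covariate matrix of one dataset
beta <- c(0.5, -1)                       # regression coefficients
a    <- 0.4                              # scale parameter
b    <- 1.5                              # shape/location parameter

u    <- runif(n)                         # standard uniform draws
base <- -log(u) * exp(-drop(Z %*% beta)) # value that Lambda_0(T) must equal

# Event times obtained by inverting each cumulative baseline hazard
T_exp  <- base / a                       # Lambda_0(t) = a * t
T_weib <- (base / a)^(1 / b)             # Lambda_0(t) = a * t^b
T_gomp <- log(1 + base * b / a) / b      # Lambda_0(t) = a * (exp(b * t) - 1) / b

# Uniform censoring times and the observed (possibly censored) time
u1 <- 0
u2 <- 3
C  <- runif(n, min = u1, max = u2)
time   <- pmin(T_weib, C)                # observed time: min(T, C)
status <- as.integer(T_weib <= C)        # 1 = event observed, 0 = censored
```

Plugging each simulated time back into its cumulative baseline hazard recovers base, which is a quick way to check that the inversion is right.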
The function will be updated for gen_data_from = "gomp".
Value
surv.simulate returns a list containing the following components:
D |
a list of |
censor_propor |
the vector of censoring proportions in the simulated datasets |
u1 |
the lower limit of the uniform distribution used to generate random censoring times with a predefined censoring rate. Sometimes this output is less than the value entered by the user, as it is adjusted to achieve the desired censoring rate; |
u2 |
the upper limit of the uniform distribution used to generate random censoring times. If |
Author(s)
Hassan Pazira
Maintainer: Hassan Pazira hassan.pazira@radboudumc.nl
References
Pazira H., Massa E., Weijers J.A.M., Coolen A.C.C. and Jonker M.A. (2024). Bayesian Federated Inference for Survival Models, arXiv. <https://arxiv.org/abs/2404.17464>
See Also
Examples
# Setting a seed for reproducibility
set.seed(1123)
#-------------------------
# Simulating Survival data
#-------------------------
N <- c(7, 10, 13) # the sample sizes of 3 datasets
beta <- 1:4
p <- length(beta)
L <- 3
# Define a function to generate multivariate normal samples
mvrnorm_new <- function(n, mu, Sigma) {
  pp <- length(mu)
  e <- matrix(rnorm(n * pp), nrow = n)
  return(crossprod(t(e), chol(Sigma)) + matrix(mu, n, pp, byrow = TRUE))
}
Z <- list()
for (z in seq_len(L)) {
  Z[[z]] <- mvrnorm_new(n = N[z], mu = rep(0, p),
                        Sigma = diag(rep(1, p), p))
  colnames(Z[[z]]) <- paste0("Z_", seq_len(ncol(Z[[z]])))
}
# One simulated dataset from exponential distribution with no censoring:
surv_data <- surv.simulate(Z = Z[[1]], beta = beta, a = exp(-.9),
                           cen_rate = 0, gen_data_from = "exp")
surv_data
surv_data$D[[1]][,1:2] # The simulated survival data
# Calculate only 'u2' with a predefined censoring rate of 0.4:
u2_new <- surv.simulate(Z = Z[1:2], beta = beta, a = exp(-.9),
                        b = exp(1.8), u1 = 0.1, only_u2 = TRUE,
                        cen_rate = 0.4, gen_data_from = "weibul")$u2
u2_new
# Two simulated datasets with a known 'u2':
# Using 'u2_new' to help control the censoring rate (chosen to be 0.4)
surv.simulate(Z = Z[1:2], beta = beta, a = exp(-.9), b = exp(1.8),
              u1 = 0.05, u2 = u2_new, gen_data_from = "weibul")
# Three simulated datasets from 'weibul' with an unknown 'u2':
surv.simulate(Z = Z, beta = beta, a = exp(-1), b = exp(1),
              u1 = 0.01, cen_rate = 0.3, gen_data_from = "weibul")
# Two simulated datasets from 'gomp' with unknown 'u2' and censoring rate of 0.3:
surv.simulate(Z = Z[2:3], beta = beta, a = exp(1), b = exp(2), u1 = 0.1,
              cen_rate = 0.3, gen_data_from = "gomp", Trace = TRUE)