bayesCureRateModel-package {bayesCureRateModel} | R Documentation
Bayesian Cure Rate Modeling for Time-to-Event Data
Description
A fully Bayesian approach for estimating a general family of cure rate models in the presence of covariates; see Papastamoulis and Milienos (2023) <doi:10.48550/arXiv.2310.06926>. The promotion time can be modelled (a) parametrically, using typical distributional assumptions for time-to-event data (including the Weibull, Exponential, Gompertz and log-Logistic distributions), or (b) semiparametrically, using finite mixtures of Gamma distributions. Posterior inference is carried out by a Metropolis-coupled Markov chain Monte Carlo (MCMC) sampler, which combines Gibbs sampling for the latent cure indicators with Metropolis-Hastings steps based on Langevin diffusion dynamics for the parameter updates. The main MCMC algorithm is embedded within a parallel tempering scheme that considers heated versions of the target posterior distribution.
The main function of the package is cure_rate_MC3. See details for a brief description of the model.
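To make the idea of heated chains concrete, here is a toy sketch of Metropolis-coupled MCMC on a bimodal one-dimensional target. It is not the package's sampler: random-walk Metropolis replaces the Langevin and Gibbs updates, and the target, the temperature ladder and all object names (`log_target`, `temps`, `draws`) are assumptions chosen for illustration only.

```r
# Toy Metropolis-coupled MCMC (parallel tempering) on a bimodal target.
# Random-walk Metropolis is used inside each chain for brevity; this only
# illustrates the tempering and swapping idea, not the package internals.
set.seed(1)
log_target <- function(x) log(0.5 * dnorm(x, -4, 1) + 0.5 * dnorm(x, 4, 1))

n_chains <- 4
temps    <- 1 / (1 + 0.5 * (0:(n_chains - 1)))  # inverse temperatures; chain 1 is cold
n_cycles <- 2000
x        <- rep(-4, n_chains)                   # current state of every chain
draws    <- numeric(n_cycles)

for (cycle in seq_len(n_cycles)) {
  # Within-chain update: chain j targets pi(x)^temps[j]
  for (j in seq_len(n_chains)) {
    prop      <- x[j] + rnorm(1, sd = 1)
    log_ratio <- temps[j] * (log_target(prop) - log_target(x[j]))
    if (log(runif(1)) < log_ratio) x[j] <- prop
  }
  # Propose swapping the states of two adjacent chains
  j        <- sample.int(n_chains - 1, 1)
  log_swap <- (temps[j] - temps[j + 1]) *
    (log_target(x[j + 1]) - log_target(x[j]))
  if (log(runif(1)) < log_swap) x[c(j, j + 1)] <- x[c(j + 1, j)]

  draws[cycle] <- x[1]  # only the cold chain is kept for inference
}
```

Heated chains see a flattened version of the target and move between modes more easily; accepted swaps pass those moves down to the cold chain, whose draws are the ones retained.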
Details
Let $\boldsymbol{y} = (y_1, \ldots, y_n)$ denote the observed data, which correspond to time-to-event data or censoring times. Let also $\boldsymbol{x}_i$ denote the covariates for subject $i$, $i = 1, \ldots, n$.
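Assuming that the $n$ observations are independent, the observed likelihood is defined as
$$L(\boldsymbol{\theta}; \boldsymbol{y}, \boldsymbol{x}) = \prod_{i=1}^{n} f_P(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i)^{\delta_i} \, S_P(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i)^{1 - \delta_i},$$
where $f_P$ and $S_P$ are the population density and survival functions, $\delta_i = 1$ if the $i$-th observation corresponds to time-to-event while $\delta_i = 0$ indicates censoring time. The parameter vector $\boldsymbol{\theta}$ is decomposed as
$$\boldsymbol{\theta} = (\boldsymbol{\alpha}', \boldsymbol{\beta}', \gamma, \lambda)$$
where
- $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_d)' \in \mathcal{A}$ are the parameters of the promotion time distribution whose cumulative distribution and density functions are denoted as $F(\cdot; \boldsymbol{\alpha})$ and $f(\cdot; \boldsymbol{\alpha})$, respectively.
- $\boldsymbol{\beta} \in \mathbb{R}^{k}$ are the regression coefficients with $k$ denoting the number of columns in the design matrix (it may include a constant term or not).
- $\gamma \in \mathbb{R}$.
- $\lambda > 0$.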
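The population survival and density functions are defined as
$$S_P(y; \boldsymbol{\theta}, \boldsymbol{x}_i) = \left(1 + \gamma \exp\{\boldsymbol{x}_i' \boldsymbol{\beta}\} \, c^{\gamma \exp\{\boldsymbol{x}_i' \boldsymbol{\beta}\}} F(y; \boldsymbol{\alpha})^{\lambda}\right)^{-1/\gamma}$$
whereas,
$$f_P(y; \boldsymbol{\theta}, \boldsymbol{x}_i) = -\frac{\partial S_P(y; \boldsymbol{\theta}, \boldsymbol{x}_i)}{\partial y}.$$
Finally, the cure rate is affected through covariates and parameters as follows
$$p_0(\boldsymbol{x}_i; \boldsymbol{\theta}) = \left(1 + \gamma \exp\{\boldsymbol{x}_i' \boldsymbol{\beta}\} \, c^{\gamma \exp\{\boldsymbol{x}_i' \boldsymbol{\beta}\}}\right)^{-1/\gamma}$$
where $c = e^{e^{-1}}$.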
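As a rough illustration, the population functions and the observed log-likelihood can be coded in a few lines of R. This is a minimal sketch, not the package's internal code. It assumes the survival function $S_P(y) = (1 + \gamma e^{x'\beta} c^{\gamma e^{x'\beta}} F(y)^{\lambda})^{-1/\gamma}$ with $c = e^{1/e}$ and $\gamma \neq 0$, and a Weibull promotion time parameterised as in `stats::pweibull` (shape, scale). The helper names `A_fun`, `p_0`, `S_P`, `f_P` and `obs_loglik`, and all numeric values, are hypothetical.

```r
# Illustrative helpers (hypothetical names, not exported by the package).
# Assumes gamma != 0 and alpha = c(shape, scale) for a Weibull promotion time.
c_const <- exp(exp(-1))

# A(x) = gamma * exp(x'beta) * c^(gamma * exp(x'beta))
A_fun <- function(X, beta, gamma) {
  theta <- exp(drop(X %*% beta))
  gamma * theta * c_const^(gamma * theta)
}

# Cure rate: the limit of S_P(y) as y grows, i.e. F(y) = 1
p_0 <- function(X, beta, gamma) {
  (1 + A_fun(X, beta, gamma))^(-1 / gamma)
}

# Population survival function
S_P <- function(y, X, alpha, beta, gamma, lambda) {
  Fy <- pweibull(y, shape = alpha[1], scale = alpha[2])
  (1 + A_fun(X, beta, gamma) * Fy^lambda)^(-1 / gamma)
}

# Population density, f_P = -dS_P/dy, obtained with the chain rule
f_P <- function(y, X, alpha, beta, gamma, lambda) {
  Fy <- pweibull(y, shape = alpha[1], scale = alpha[2])
  fy <- dweibull(y, shape = alpha[1], scale = alpha[2])
  A  <- A_fun(X, beta, gamma)
  (A / gamma) * lambda * Fy^(lambda - 1) * fy *
    (1 + A * Fy^lambda)^(-1 / gamma - 1)
}

# Observed log-likelihood: events contribute log f_P, censored times log S_P
obs_loglik <- function(y, X, delta, alpha, beta, gamma, lambda) {
  sum(delta * log(f_P(y, X, alpha, beta, gamma, lambda)) +
      (1 - delta) * log(S_P(y, X, alpha, beta, gamma, lambda)))
}

# Toy inputs (arbitrary values, for illustration only)
set.seed(1)
X      <- cbind(1, rnorm(5))
y      <- rexp(5) + 0.1
delta  <- c(1, 0, 1, 1, 0)
alpha  <- c(1.5, 2)
beta   <- c(0.3, -0.5)
gamma  <- 0.7
lambda <- 1.2
obs_loglik(y, X, delta, alpha, beta, gamma, lambda)
```

Since $F(y) \to 1$ as $y \to \infty$, `S_P` evaluated at a very large time coincides with `p_0`, which is a convenient sanity check for any implementation.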
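The promotion time distribution can be a member of standard families (currently available are the following: Exponential, Weibull, Gamma, Lomax, Gompertz, log-Logistic) and in this case $\boldsymbol{\alpha} \in (0, \infty)^d$ with $d \leq 2$. Also considered is the Dagum distribution, which has three parameters $(\alpha_1, \alpha_2, \alpha_3) \in (0, \infty)^3$. If the previous parametric assumptions are not justified, the promotion time can belong to the more flexible family of finite mixtures of Gamma distributions. For example, assume a mixture of two Gamma distributions of the form
$$f(y; \boldsymbol{\alpha}) = \alpha_5 f_{\mathcal{G}}(y; \alpha_1, \alpha_3) + (1 - \alpha_5) f_{\mathcal{G}}(y; \alpha_2, \alpha_4),$$
where
$$f_{\mathcal{G}}(y; a, b) = \frac{b^{a}}{\Gamma(a)} y^{a - 1} \exp\{-b y\}, \quad y > 0,$$
denotes the density of the Gamma distribution with parameters $a > 0$ (shape) and $b > 0$ (rate).
For the previous model, the parameter vector is
$$\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)' \in \mathcal{A}$$
where $\mathcal{A} = (0, \infty)^4 \times (0, 1)$.
More generally, one can fit a mixture of $K > 2$ Gamma distributions. The appropriate model can be selected according to information criteria such as the BIC.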
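A two-component Gamma mixture and a generic BIC calculation can be sketched in R as follows. The ordering of the parameter vector (two shapes, then two rates, then the weight of the first component) is an assumption of this sketch, as are the function names `d_gamma_mix`, `p_gamma_mix` and `bic`. The package provides its own `log_gamma_mixture` function, whose interface is not reproduced here, and the way the package itself computes the BIC is not shown in this page.

```r
# Two-component Gamma mixture; assumed ordering:
# alpha = c(shape1, shape2, rate1, rate2, weight1)
d_gamma_mix <- function(y, alpha) {
  alpha[5] * dgamma(y, shape = alpha[1], rate = alpha[3]) +
    (1 - alpha[5]) * dgamma(y, shape = alpha[2], rate = alpha[4])
}

p_gamma_mix <- function(y, alpha) {
  alpha[5] * pgamma(y, shape = alpha[1], rate = alpha[3]) +
    (1 - alpha[5]) * pgamma(y, shape = alpha[2], rate = alpha[4])
}

# Textbook BIC: -2 * log-likelihood + (number of free parameters) * log(n).
# Smaller values indicate a better trade-off between fit and complexity.
bic <- function(loglik, n_par, n) -2 * loglik + n_par * log(n)

# Toy check: the mixture density integrates to (approximately) one
alpha <- c(2, 5, 1, 0.5, 0.3)
integrate(d_gamma_mix, lower = 0, upper = Inf, alpha = alpha)$value
```

When comparing fits with different numbers of components, `n_par` should count every free parameter of the fitted model, not only those of the promotion time distribution.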
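The binary vector $\boldsymbol{I} = (I_1, \ldots, I_n)$ contains the (latent) cure indicators, that is, $I_i = 1$ if the $i$-th subject is susceptible and $I_i = 0$ if the $i$-th subject is cured. $\Delta_0$ denotes the subset of $\{1, \ldots, n\}$ containing the censored subjects, whereas $\Delta_1 = \Delta_0^c$ is the (complementary) subset of uncensored subjects. The complete likelihood of the model is
$$L_c(\boldsymbol{\theta}; \boldsymbol{y}, \boldsymbol{I}) = \prod_{i \in \Delta_1} (1 - p_0(\boldsymbol{x}_i; \boldsymbol{\theta})) f_U(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i) \prod_{i \in \Delta_0} p_0(\boldsymbol{x}_i; \boldsymbol{\theta})^{1 - I_i} \left\{(1 - p_0(\boldsymbol{x}_i; \boldsymbol{\theta})) S_U(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i)\right\}^{I_i}.$$
$f_U$ and $S_U$ denote the probability density and survival function of the susceptibles, respectively, that is
$$S_U(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i) = \frac{S_P(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i) - p_0(\boldsymbol{x}_i; \boldsymbol{\theta})}{1 - p_0(\boldsymbol{x}_i; \boldsymbol{\theta})}, \qquad f_U(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i) = \frac{f_P(y_i; \boldsymbol{\theta}, \boldsymbol{x}_i)}{1 - p_0(\boldsymbol{x}_i; \boldsymbol{\theta})}.$$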
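The complete likelihood also determines how the latent cure indicators can be updated. For a censored subject, the terms involving $I_i$ are $p_0^{1 - I_i} \{(1 - p_0) S_U\}^{I_i}$, so with $S_U = (S_P - p_0)/(1 - p_0)$ the conditional probability of being susceptible is $(S_P - p_0)/S_P$; uncensored subjects are susceptible with probability one. The sketch below works under these assumptions and takes per-subject values of $p_0$, $S_P$ and $f_P$ as already computed. The function names `complete_loglik` and `prob_susceptible` and the toy numbers are hypothetical, and the package's own `complete_log_likelihood_general` may be organised differently.

```r
# Inputs are per-subject vectors evaluated at the current parameter values:
#   p0    cure probability        SP  population survival at y_i
#   fP    population density      delta  1 = event, 0 = censored
#   I     latent indicator (1 = susceptible, 0 = cured)
complete_loglik <- function(p0, SP, fP, delta, I) {
  SU  <- (SP - p0) / (1 - p0)  # survival function of the susceptibles
  fU  <- fP / (1 - p0)         # density of the susceptibles
  unc <- delta == 1
  cen <- !unc
  sum(log(1 - p0[unc]) + log(fU[unc])) +
    sum((1 - I[cen]) * log(p0[cen]) +
        I[cen] * (log(1 - p0[cen]) + log(SU[cen])))
}

# Conditional probability that each subject is susceptible
prob_susceptible <- function(p0, SP, delta) {
  ifelse(delta == 1, 1, (SP - p0) / SP)
}

# Toy values (arbitrary, for illustration only)
p0    <- c(0.3, 0.4, 0.2)
SP    <- c(0.6, 0.5, 0.9)
fP    <- c(0.2, 0.1, 0.05)
delta <- c(1, 0, 0)
I     <- c(1, 1, 0)

complete_loglik(p0, SP, fP, delta, I)

# One Gibbs-type draw of the latent indicators
set.seed(1)
I_new <- rbinom(length(p0), size = 1, prob = prob_susceptible(p0, SP, delta))
```

Subjects with an observed event always get `I_new = 1`, since only susceptible subjects can experience the event.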
Index of help topics:
bayesCureRateModel-package        Bayesian Cure Rate Modeling for Time-to-Event Data
complete_log_likelihood_general   Logarithm of the complete log-likelihood for the general cure rate model.
cure_rate_MC3                     Main function of the package
cure_rate_mcmc                    The basic MCMC scheme.
log_dagum                         PDF and CDF of the Dagum distribution
log_gamma                         PDF and CDF of the Gamma distribution
log_gamma_mixture                 PDF and CDF of a Gamma mixture distribution
log_gompertz                      PDF and CDF of the Gompertz distribution
log_logLogistic                   PDF and CDF of the log-Logistic distribution.
log_lomax                         PDF and CDF of the Lomax distribution
log_weibull                       PDF and CDF of the Weibull distribution
marriage_dataset                  Marriage data
plot.bayesCureModel               Plot method
print.bayesCureModel              Print method
summary.bayesCureModel            Summary method.
Author(s)
Panagiotis Papastamoulis and Fotios S. Milienos
Maintainer: Panagiotis Papastamoulis <papapast@yahoo.gr>
References
Papastamoulis and Milienos (2023). Bayesian inference and cure rate modeling for event history data. arXiv:2310.06926
See Also
cure_rate_MC3
Examples
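# TOY EXAMPLE (very small numbers... only for CRAN check purposes)
library(bayesCureRateModel)
# simulate toy data
set.seed(10)
n <- 4
# censoring status: 1 = time-to-event, 0 = censoring time
stat <- rbinom(n, size = 1, prob = 0.5)
# design matrix: constant term plus one covariate
x <- cbind(1, matrix(rnorm(n), n, 1))
y <- rexp(n)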
# run a Weibull model with default prior setup
# considering 2 heated chains
fit1 <- cure_rate_MC3(y = y, X = x, Censoring_status = stat,
                      promotion_time = list(distribution = 'weibull'),
                      nChains = 2,
                      nCores = 1,
                      mcmc_cycles = 3, sweep = 2)
# print method
fit1
# summary method
summary1 <- summary(fit1)
# WARNING: the parameters mcmc_cycles and nChains
# should take _larger_ values in practice.
# E.g. a typical implementation consists of:
# mcmc_cycles = 15000, nChains = 12
# run a Gamma mixture model with K = 2 components and default prior setup
fit2 <- cure_rate_MC3(y = y, X = x, Censoring_status = stat,
                      promotion_time = list(
                        distribution = 'gamma_mixture',
                        K = 2),
                      nChains = 8, nCores = 2,
                      mcmc_cycles = 10)
summary2 <- summary(fit2)