BayesID_AFT {SemiCompRisks} | R Documentation |
The function to implement Bayesian parametric and semi-parametric analyses for semi-competing risks data in the context of accelerated failure time (AFT) models.
Description
Independent semi-competing risks data can be analyzed using AFT models that have a hierarchical structure. The proposed models can accommodate left-truncated and/or interval-censored data. An efficient computational algorithm is developed that gives users the flexibility to adopt either a fully parametric (log-Normal) or a semi-parametric (Dirichlet process mixture) model specification.
Usage
BayesID_AFT(Formula, data, model = "LN", hyperParams, startValues,
mcmcParams, na.action = "na.fail", subset=NULL, path=NULL)
Arguments
Formula |
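a Formula object with the outcomes on the left of a ~ and the covariates on the right, structured as in the Examples: left-truncation time | lower and upper bounds for the non-terminal event time | lower and upper bounds for the terminal event time ~ covariates for transition 1 | covariates for transition 2 | covariates for transition 3, e.g. LT | y1L + y1U | y2L + y2U ~ x1 + x2 + x3 | x1 + x2 | x1 + x2. |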
data |
a data.frame in which to interpret the variables named in Formula. |
model |
The specification of baseline survival distribution: "LN" or "DPM". |
hyperParams |
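a list containing lists or vectors for hyperparameter values in hierarchical models. As used in the Examples, components include theta (a vector for the prior of the subject-specific random effects variance component), LN (a list of vectors LN.ab1, LN.ab2, LN.ab3 for the log-Normal model), and DPM (a list containing DPM.mu1 to DPM.mu3, DPM.sigSq1 to DPM.sigSq3, DPM.ab1 to DPM.ab3, and Tau.ab1 to Tau.ab3 for the DPM model). |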
startValues |
a list containing vectors of starting values for model parameters. It can be specified as the object returned by the function initiate.startValues_AFT. |
mcmcParams |
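a list containing variables required for MCMC sampling. As used in the Examples, components include run (a list with numReps, thin, and burninPerc), storage (a list with nGam_save, nY1_save, nY2_save, and nY1.NA_save), and tuning (a list with betag.prop.var, mug.prop.var, zetag.prop.var, and gamma.prop.var). |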
na.action |
how NAs are treated. See |
subset |
a specification of the rows to be used: defaults to all rows. See |
path |
the name of directory where the results are saved. |
Details
We view the semi-competing risks data as arising from an underlying illness-death model system in which individuals may undergo one or more of three transitions: 1) from some initial condition to non-terminal event, 2) from some initial condition to terminal event, 3) from non-terminal event to terminal event. Let $T_{i1}$, $T_{i2}$ denote time to non-terminal and terminal event from subject $i = 1, \ldots, n$. We propose to directly model the times of the events via the following AFT model specification:
$$\log(T_{i1}) = x_{i1}^\top \beta_1 + \gamma_i + \epsilon_{i1}, \quad T_{i1} > 0,$$
$$\log(T_{i2}) = x_{i2}^\top \beta_2 + \gamma_i + \epsilon_{i2}, \quad T_{i2} > 0,$$
$$\log(T_{i2} - T_{i1}) = x_{i3}^\top \beta_3 + \gamma_i + \epsilon_{i3}, \quad T_{i2} > T_{i1},$$
where $x_{ig}$ is a vector of transition-specific covariates, $\beta_g$ is a corresponding vector of transition-specific regression parameters and $\epsilon_{ig}$ is a transition-specific random variable whose distribution determines that of the corresponding transition time, $g \in \{1, 2, 3\}$. $\gamma_i$ is a study participant-specific random effect that induces positive dependence between the two event times, thereby performing a role analogous to that performed by frailties in models for the hazard function.
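As an illustration of the AFT specification above, the following sketch simulates event times for the three transitions under log-Normal errors. It is not the package's own simulation routine: the parameter values and the latent-time construction (draw both times from the initial state, then add a sojourn time when the non-terminal event comes first) are assumptions made for this example.

```r
set.seed(1)
n <- 500

# Hypothetical covariate and parameter values (not taken from the package)
x     <- rnorm(n)
beta  <- c(0.5, 0.8, -0.5)   # beta_1, beta_2, beta_3
mu    <- c(1.0, 1.5, 0.5)    # means of the error terms
sigma <- c(0.5, 0.5, 0.5)    # standard deviations of the error terms
theta <- 0.25                # variance of the random effects gamma_i

# Participant-specific random effect shared by all three transitions
gamma <- rnorm(n, 0, sqrt(theta))

# Transitions 1 and 2: latent times from the initial condition
t1 <- exp(x * beta[1] + gamma + rnorm(n, mu[1], sigma[1]))
t2 <- exp(x * beta[2] + gamma + rnorm(n, mu[2], sigma[2]))

# Transition 3: time from the non-terminal to the terminal event
gap <- exp(x * beta[3] + gamma + rnorm(n, mu[3], sigma[3]))

# If the non-terminal event comes first, the terminal time is T1 + gap;
# otherwise the non-terminal event never occurs (coded as Inf)
nonterm_first <- t1 < t2
T1 <- ifelse(nonterm_first, t1, Inf)
T2 <- ifelse(nonterm_first, t1 + gap, t2)
```

Because the same gamma enters every equation, participants with a large random effect tend to have long times on all transitions, which is the positive dependence described above.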
Let $L_i$ denote the time at study entry (i.e. the left-truncation time). Furthermore, suppose that study participant $i$ was observed at follow-up times $\{c_{i1}, \ldots, c_{im_i}\}$ and let $c_i^*$ denote the time to the end of study or to administrative right-censoring. Considering interval-censoring for both events, the times to non-terminal and terminal event for the $i^{th}$ study participant satisfy $c_{ij} \leq T_{i1} < c_{i,j+1}$ for some $j$ and $c_{ik} \leq T_{i2} < c_{i,k+1}$ for some $k$, respectively. Then the observed outcomes for the $i^{th}$ study participant can be succinctly denoted by $\{c_{ij}, c_{i,j+1}, c_{ik}, c_{i,k+1}, L_i\}$.
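The interval-censoring scheme described above can be sketched as follows. The helper bracket_time is hypothetical (it is not part of SemiCompRisks); it assumes a common visit schedule and codes a time beyond the last visit with an upper bound of Inf, the same convention the Examples use for right-censored observations.

```r
# Hypothetical helper: bracket an event time by the scheduled follow-up times
bracket_time <- function(t, visits, entry = 0) {
  # visits: increasing follow-up times; the last one plays the role of c*
  # findInterval uses left-closed intervals [c_j, c_{j+1}), matching the text
  j <- findInterval(t, visits)
  lower <- if (j == 0) entry else visits[j]
  upper <- if (j == length(visits)) Inf else visits[j + 1]
  c(lower = lower, upper = upper)
}

visits <- c(2, 4, 6, 8, 10)
bracket_time(4.7, visits)   # event between the visits at 4 and 6
bracket_time(12, visits)    # event after the last visit: right-censored
```

Applying such a helper to each participant's non-terminal and terminal times yields the lower and upper bounds that enter the left-hand side of the Formula.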
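For the Bayesian semi-parametric analysis, we proceed by adopting independent DPM of normal distributions for each $\epsilon_{ig}$. More precisely, $\epsilon_{ig}$ is taken to be an independent draw from a mixture of $M_g$ normal distributions with means and variances ($\mu_{gr}$, $\sigma_{gr}^2$), for $r \in \{1, \ldots, M_g\}$. Since the class-specific ($\mu_{gr}$, $\sigma_{gr}^2$) are not known, they are taken to be draws from some common distribution, $G_{g0}$, often referred to as the centering distribution. Furthermore, since the ‘true’ class membership for any given study participant is not known, we let $p_{gr}$ denote the probability of belonging to the $r^{th}$ class for transition $g$ and $p_g = (p_{g1}, \ldots, p_{gM_g})$ the collection of such probabilities. Note, $p_g$ is defined at the level of the population (i.e. is not study participant-specific) and its components add up to 1.0. In the absence of prior knowledge regarding the distribution of class memberships for the $n$ individuals across the $M_g$ classes, $p_g$ is assumed to follow a conjugate symmetric Dirichlet($\tau_g/M_g, \ldots, \tau_g/M_g$) distribution, where $\tau_g$ is referred to as the precision parameter. The finite mixture distribution can then be succinctly represented as:
$$\epsilon_{ig} \mid r_{ig} \sim \mathrm{Normal}(\mu_{g r_{ig}}, \sigma_{g r_{ig}}^2),$$
$$(\mu_{gr}, \sigma_{gr}^2) \sim G_{g0}, \quad r = 1, \ldots, M_g,$$
$$r_{ig} \mid p_g \sim \mathrm{Discrete}(r_{ig} \mid p_{g1}, \ldots, p_{gM_g}),$$
$$p_g \sim \mathrm{Dirichlet}(\tau_g/M_g, \ldots, \tau_g/M_g),$$
where $r_{ig}$ denotes the class membership of study participant $i$ for transition $g$. Letting $M_g$ approach infinity, this specification is referred to as a DPM of normal distributions. In our proposed framework, we specify a Gamma($a_{\tau_g}$, $b_{\tau_g}$) hyperprior for $\tau_g$. For regression parameters, we adopt non-informative flat priors on the real line. For $\gamma = \{\gamma_1, \ldots, \gamma_n\}$, we assume that each $\gamma_i$ is an independent random draw from a Normal(0, $\theta$) distribution. In the absence of prior knowledge on the variance component $\theta$, we adopt a conjugate inverse-Gamma hyperprior, IG($a_\theta$, $b_\theta$). Finally, we take $G_{g0}$ to be the product of a normal distribution centered at $\mu_{g0}$ with variance $\sigma_{g0}^2$ for $\mu_{gr}$ and an IG($a_{\sigma_g}$, $b_{\sigma_g}$) for $\sigma_{gr}^2$.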
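To make the finite mixture representation concrete, the sketch below draws error terms for a single transition from a mixture with a fixed number of classes. It only illustrates the prior structure and is not the sampler used by BayesID_AFT. The centering-distribution values are borrowed from the Examples; the number of classes, the precision parameter, and the shape/rate reading of IG(a, b) are assumptions.

```r
set.seed(2)
n   <- 1000   # number of study participants
M   <- 20     # number of mixture classes (finite truncation, assumed)
tau <- 2      # precision parameter (assumed value)

# Centering distribution G_0: values as in the Examples section
mu0    <- log(12)
sigSq0 <- 100
a_sig  <- 2
b_sig  <- 1

# Class probabilities p ~ Dirichlet(tau/M, ..., tau/M) via normalised gammas
g <- rgamma(M, shape = tau / M, rate = 1)
p <- g / sum(g)

# Class-specific (mu_r, sigma_r^2) drawn from G_0;
# IG(a, b) is read here as shape a and rate b (an assumption)
mu_r    <- rnorm(M, mu0, sqrt(sigSq0))
sigSq_r <- 1 / rgamma(M, shape = a_sig, rate = b_sig)

# Class labels r_i ~ Discrete(p), then eps_i ~ Normal(mu_{r_i}, sigma_{r_i}^2)
r   <- sample.int(M, n, replace = TRUE, prob = p)
eps <- rnorm(n, mu_r[r], sqrt(sigSq_r[r]))
```

With a small precision parameter most of the probability mass falls on a few classes, so the sampled errors cluster around a handful of class means; larger values spread the mass more evenly.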
For the Bayesian parametric analysis, we build on the log-Normal formulation and take the $\epsilon_{ig}$ to follow independent Normal($\mu_g$, $\sigma_g^2$) distributions, $g$ = 1, 2, 3. For location parameters $\{\mu_1, \mu_2, \mu_3\}$, we adopt non-informative flat priors on the real line. For $\{\sigma_1^2, \sigma_2^2, \sigma_3^2\}$, we adopt independent inverse Gamma distributions, denoted IG($a_{\sigma_g}$, $b_{\sigma_g}$). For $\beta_g$, $\gamma$, and $\theta$, we adopt the same priors as those adopted for the DPM model.
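Under the log-Normal formulation, the survival function for a transition out of the initial condition has a closed form, which the following sketch evaluates. The function surv_LN and all parameter values are illustrative assumptions, not part of the package; the example shows how a regression coefficient acts as a multiplicative factor on the time scale.

```r
# Survival function of T when log(T) = x*beta + gamma + eps,
# with eps ~ Normal(mu, sigSq)
surv_LN <- function(t, x, beta, gamma, mu, sigSq) {
  1 - pnorm((log(t) - x * beta - gamma - mu) / sqrt(sigSq))
}

# Illustrative values only
beta  <- log(2)
mu    <- log(12)
sigSq <- 0.5

# The median solves S(t) = 0.5, i.e. t = exp(x*beta + gamma + mu)
med0 <- exp(0 * beta + 0 + mu)   # x = 0, gamma = 0
med1 <- exp(1 * beta + 0 + mu)   # x = 1, gamma = 0

med1 / med0                      # acceleration factor exp(beta)
surv_LN(med0, x = 0, beta = beta, gamma = 0, mu = mu, sigSq = sigSq)
```

A one-unit increase in the covariate rescales every quantile of the event time by exp(beta), so the survival curve for x = 1 is the curve for x = 0 stretched along the time axis.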
Value
BayesID_AFT returns an object of class Bayes_AFT.
Note
The posterior samples of $\gamma$ are saved separately in working directory/path.
For a dataset with large $n$, nGam_save should be carefully specified considering the system memory and the storage capacity.
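A rough way to gauge the storage implied by nGam_save is sketched below. The calculation assumes that the number of retained scans is numReps / thin * (1 - burninPerc) and that each saved value is stored as an 8-byte double; both are assumptions for this back-of-the-envelope estimate, not documented behaviour of the package, and the run settings are hypothetical.

```r
# Hypothetical settings for a long run on a large dataset
numReps    <- 1e6
thin       <- 10
burninPerc <- 0.5
nGam_save  <- 5000

# Assumed number of retained scans after thinning and burn-in
n_stored <- numReps / thin * (1 - burninPerc)

# 8 bytes per double-precision value
bytes <- n_stored * nGam_save * 8
bytes / 1024^3   # approximate size in GiB
```

Under these assumptions the saved random effects alone take roughly 1.9 GiB per chain, which is why nGam_save is worth keeping small when n is large.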
Author(s)
Kyu Ha Lee and Sebastien Haneuse
Maintainer: Kyu Ha Lee <klee15239@gmail.com>
References
Lee, K. H., Rondeau, V., and Haneuse, S. (2017),
Accelerated failure time models for semicompeting risks data in the presence of complex censoring, Biometrics, 73, 4, 1401-1412.
Alvares, D., Haneuse, S., Lee, C., Lee, K. H. (2019),
SemiCompRisks: An R package for the analysis of independent and cluster-correlated semi-competing risks data, The R Journal, 11, 1, 376-400.
See Also
initiate.startValues_AFT, print.Bayes_AFT, summary.Bayes_AFT, predict.Bayes_AFT
Examples
## Not run:
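# loading a data set
data(scrData)
# Encode each outcome as an interval (lower, upper); observations with
# event indicator 0 are right-censored and get an upper bound of Inf
scrData$y1L <- scrData$y1U <- scrData[,1]
scrData$y1U[which(scrData[,2] == 0)] <- Inf
scrData$y2L <- scrData$y2U <- scrData[,3]
scrData$y2U[which(scrData[,4] == 0)] <- Inf
# No left truncation: the study entry time is 0 for every subject
scrData$LT <- rep(0, dim(scrData)[1])
# LHS: entry time | non-terminal interval | terminal interval
# RHS: covariates for transitions 1 | 2 | 3
form <- Formula(LT | y1L + y1U | y2L + y2U ~ x1 + x2 + x3 | x1 + x2 | x1 + x2)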
#####################
## Hyperparameters ##
#####################
## Subject-specific random effects variance component
##
theta.ab <- c(0.5, 0.05)
## log-Normal model
##
LN.ab1 <- c(0.3, 0.3)
LN.ab2 <- c(0.3, 0.3)
LN.ab3 <- c(0.3, 0.3)
## DPM model
##
DPM.mu1 <- log(12)
DPM.mu2 <- log(12)
DPM.mu3 <- log(12)
DPM.sigSq1 <- 100
DPM.sigSq2 <- 100
DPM.sigSq3 <- 100
DPM.ab1 <- c(2, 1)
DPM.ab2 <- c(2, 1)
DPM.ab3 <- c(2, 1)
Tau.ab1 <- c(1.5, 0.0125)
Tau.ab2 <- c(1.5, 0.0125)
Tau.ab3 <- c(1.5, 0.0125)
##
hyperParams <- list(theta=theta.ab,
    LN=list(LN.ab1=LN.ab1, LN.ab2=LN.ab2, LN.ab3=LN.ab3),
    DPM=list(DPM.mu1=DPM.mu1, DPM.mu2=DPM.mu2, DPM.mu3=DPM.mu3, DPM.sigSq1=DPM.sigSq1,
        DPM.sigSq2=DPM.sigSq2, DPM.sigSq3=DPM.sigSq3, DPM.ab1=DPM.ab1, DPM.ab2=DPM.ab2,
        DPM.ab3=DPM.ab3, Tau.ab1=Tau.ab1, Tau.ab2=Tau.ab2, Tau.ab3=Tau.ab3))
###################
## MCMC SETTINGS ##
###################
## Setting for the overall run
##
numReps <- 300
thin <- 3
burninPerc <- 0.5
## Setting for storage
##
nGam_save <- 10
nY1_save <- 10
nY2_save <- 10
nY1.NA_save <- 10
## Tuning parameters for specific updates
##
## - those common to all models
betag.prop.var <- c(0.01,0.01,0.01)
mug.prop.var <- c(0.1,0.1,0.1)
zetag.prop.var <- c(0.1,0.1,0.1)
gamma.prop.var <- 0.01
##
mcmcParams <- list(run=list(numReps=numReps, thin=thin, burninPerc=burninPerc),
    storage=list(nGam_save=nGam_save, nY1_save=nY1_save, nY2_save=nY2_save,
        nY1.NA_save=nY1.NA_save),
    tuning=list(betag.prop.var=betag.prop.var, mug.prop.var=mug.prop.var,
        zetag.prop.var=zetag.prop.var, gamma.prop.var=gamma.prop.var))
#################################################################
## Analysis of Independent Semi-competing risks data ############
#################################################################
###############
## logNormal ##
###############
##
myModel <- "LN"
myPath <- "Output/01-Results-LN/"
startValues <- initiate.startValues_AFT(form, scrData, model=myModel, nChain=2)
##
fit_LN <- BayesID_AFT(form, scrData, model=myModel, hyperParams,
    startValues, mcmcParams, path=myPath)
fit_LN
summ.fit_LN <- summary(fit_LN); names(summ.fit_LN)
summ.fit_LN
pred_LN <- predict(fit_LN, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_LN, plot.est="Haz")
plot(pred_LN, plot.est="Surv")
#########
## DPM ##
#########
##
myModel <- "DPM"
myPath <- "Output/02-Results-DPM/"
startValues <- initiate.startValues_AFT(form, scrData, model=myModel, nChain=2)
##
fit_DPM <- BayesID_AFT(form, scrData, model=myModel, hyperParams,
    startValues, mcmcParams, path=myPath)
fit_DPM
summ.fit_DPM <- summary(fit_DPM); names(summ.fit_DPM)
summ.fit_DPM
pred_DPM <- predict(fit_DPM, time = seq(0, 35, 1), tseq=seq(from=0, to=30, by=5))
plot(pred_DPM, plot.est="Haz")
plot(pred_DPM, plot.est="Surv")
## End(Not run)