R: A Bayesian survival regression with an error distribution...

bayessurvreg1 {bayesSurv}

R Documentation

A Bayesian survival regression with an error distribution expressed as a~normal mixture with unknown number of components

Description

A function to sample from the posterior distribution for a survival regression model

\log(T_{i,l}) = \beta^T x_{i,l} + b_i^T z_{i,l} + \varepsilon_{i,l},\quad i=1,\dots,N,\ l=1,\dots,n_i,

where distribution of \varepsilon_{i,l} is specified as a normal mixture with unknown number of components as in Richardson and Green (1997) and random effect b_i is normally distributed.

See Komárek (2006) or Komárek and Lesaffre (2007) for more detailed description of prior assumptions.

Sampled values are stored on a disk to be further worked out by e.g. coda or boa.

Usage

bayessurvreg1(formula, random,
   data = parent.frame(), subset,
   na.action = na.fail,
   x = FALSE, y = FALSE, onlyX = FALSE,
   nsimul = list(niter = 10, nthin = 1, nburn = 0,
                 nnoadapt = 0, nwrite = 10),
   prior = list(kmax = 5, k.prior = "poisson", poisson.k = 3,
                dirichlet.w = 1,
                mean.mu = NULL, var.mu = NULL,
                shape.invsig2 = 1.5,
                shape.hyper.invsig2 = 0.8, rate.hyper.invsig2 = NULL,
                pi.split = NULL, pi.birth = NULL,
                Eb0.depend.mix = FALSE),
   prior.beta, prior.b, prop.revjump,
   init = list(iter = 0, mixture = NULL, beta = NULL,
               b = NULL, D = NULL,
               y = NULL, r = NULL, otherp = NULL, u = NULL),
   store = list(y = TRUE, r = TRUE, b = TRUE, u = TRUE,
                MHb = FALSE, regresres = FALSE),
   dir,
   toler.chol = 1e-10, toler.qr = 1e-10, ...)

Arguments

formula

model formula for the ‘fixed’ part of the model, i.e. the part that specifies \beta^T x_{i,l}. See survreg for further details. Intercept is implicitely included in the model by estimation of the error distribution. As a consequence -1 in the model formula does not have any effect on the model.

The left-hand side of the formula must be an~objecy created using Surv.

If random is used then the formula must contain an identification of clusters in the form cluster(id), where id is a name of the variable that determines clusters, e.g.

Surv(time, event)~gender + cluster(id).

random

formula for the ‘random’ part of the model, i.e. the part that specifies b_i^T z_{i,l}. If omitted, no random part is included in the model. E.g. to specify the model with a random intercept, say random=~1. All effects mentioned in random should also be mentioned on the right-hand side of formula.

When some random effects are included the random intercept is added by default. It can be removed using e.g. random=~-1 + gender.

data

optional data frame in which to interpret the variables occuring in the formulas.

subset

subset of the observations to be used in the fit.

na.action

function to be used to handle any NAs in the data. The user is discouraged to change a default value na.fail.

x

if TRUE then the X matrix is returned. This matrix contain all columns appearing in both formula and random parameters.

y

if TRUE then the y matrix (of log-survival times) is returned.

onlyX

if TRUE, no McMC is performed. The function returns only a design matrix of your model (intercept excluded). It might be useful to set up correctly a parameter for a block update of \beta (regression parameters related to the fixed effects) and \gamma (means of the random effects, random intercept excluded) parameters in the model if Metropolis-Hastings is to be used instead of default Gibbs.

nsimul

a list giving the number of iterations of the McMC and other parameters of the simulation.

niter: total number of sampled values after discarding thinned ones, burn-up included.
nthin: thinning interval.
nburn: number of sampled values in a burn-up period after discarding thinned values. This value should be smaller than niter. If not, nburn is set to niter - 1. It can be set to zero.
nnoadapt: applicable if some blocks of parameters are updated using an adaptive Metropolis algorithm. This is a number of sampled values that are generated using an initial and fixed proposal covariance matrix. It should be smaller or equal to nburn. If not, nnoadapt is set to nburn.
nwrite: an interval at which sampled values are written to files.

prior

a list that identifies prior hyperparameters and prior choices. See accompanying paper for more details. Some prior parameters can be guessed by the function itself. If you want to do so, set such parameters to NULL. Set to NULL also the parameters that are not needed in your model.

kmax: value of k_{max}, upper limit for the number of mixture components. Its high values like 100 will usually correspond to \infty.
k.prior: a string specifying the prior distribution of k, number of mixture components. Valid are either “poisson”, “uniform”, or “fixed”. When “fixed” is given then the number of mixture components is not sampled.
poisson.k: prior hyperparameter \lambda for the number of mixture components $k$ if Poisson prior for this parameter is used.
dirichlet.w: prior hyperparameter \delta for the Dirichlet distribution of mixture weights w_1,\dots,w_k.
mean.mu: prior hyperparameter \xi for the mean of the normal prior for mixture means \mu_1,\dots,\mu_k.
var.mu: prior hyperparameter \kappa for the variance of the normal prior for mixture means \mu_1,\dots,\mu_k.
shape.invsig2: prior hyperparameter \zeta for the shape of the inverse-gamma distribution for the mixture variances \sigma_1^2,\dots,\sigma_k^2.
shape.hyper.invsig2: prior hyperparameter (shape) g for the gamma distribution of the parameter \eta. Remember, \eta is a scale parameter of the inverse-gamma distribution for the mixture variances \sigma_1^2,\dots,\sigma_k^2.
rate.hyper.invsig2: prior hyperparameter (rate) h for the gamma distribution of the parameter \eta. Remember, \eta is a scale parameter of the inverse-gamma distribution for the mixture variances \sigma_1^2,\dots,\sigma_k^2.
pi.split: probabilities of a split move within the reversible jump McMC. It must be a vector of length equal to kmax with the first component equal to 1 and the last component equal to 0. If NULL 2nd to (k-1)th components are set to 0.5.
pi.birth: probabilities of a birth move within the reversible jump McMC. It must be a vector of length equal to kmax with the first component equal to 1 and the last component equal to 0. If NULL 2nd to (k-1)th components are set to 0.5.
Eb0.depend.mix: this will normally be FALSE. Setting this option to TRUE served for some experiments during the development of this function. In principle, when this is set to TRUE and the random intercept is included in the model then it is assumed that the mean of the random intercept is not zero but \sum_{j=1}^k w_j\mu_j, i.e. the mean of the random intercept depends on mixture. However, this did not werk too well.

prior.beta

a list defining the blocks of \beta parameters (both fixed effects and means of random effects, except the random intercept) that are to be updated together (in a block), a description of how they are updated and a specification of priors. The list is assumed to have the following components.

mean.prior

a vector specifying a prior mean for each \beta parameter in the model.

var.prior

a vector specifying a prior variance for each \beta parameter. It is recommended to run the function bayessurvreg1 first with its argument onlyX set to TRUE to find out how the \betas are sorted. They must correspond to a design matrix X.

blocks

a list with the following components.

ind.block: a list with vectors with indeces of columns of the design matrix X defining the effect of \betas in the block. If not specified, all \beta parameters corresponding to fixed effects are updated in one block and remaining \beta parameters (means of random effects) in the second block using the Gibbs move.
cov.prop: a list with vectors with a lower triangle of the covariance matrix which is used in the normal proposal (use a command lower.tri with diag = TRUE to get a lower triangle from a matrix) when one of the Metopolis-like algorithms is used for a given block. This matrix is used at each iteration if the given block is updated using a standard random-walk Metropolis-Hastings step. If the block is updated using an adaptive Metropolis step this matrix is used only at start. If not specified and Metropolis-like algorith is required a diagonal matrix with prior variances for corresponding \beta on a diagonal is used. It is set to a vector of zeros of appropriate length when the Gibbs move is required for a given block.

type.upd

a character vector specifying the type of the update that will be used for each block. Valid are substrings of either "gibbs" or "adaptive.metropolis" or "random.walk.metropolis". Default is "gibbs" for all blocks.

mean.sampled

a vector of means of up to now sampled values. This component is useful when the adaptive Metropolis algorithm is used and we do not start from the beginning (e.g. already several iterations of McMC have already been performed). Otherwise, this component does not have to be filled.

eps.AM

a vector with \epsilon from the adaptive Metropolis algorithm for each block.

sd.AM

a vector specifying s_d, d = 1,\dots, D numbers from the adaptive Metropolis algorithm for each dimension. This vector must be of length equal at least to the length of the longest block. Defaults values are \frac{1}{d}2.4^2 where d denotes a length of the block.

weight.unif

a vector specifying the weight of the uniform component in the proposal for each block. If not specified, it is equal to 0.5 for all parameters.

half.range.unif

a vector of same length as the number of columns in the design matrix X specifying the half range of the uniform component of the proposal.

prior.b

a list defining the way in which the random effects are to be updated and the specification of priors for random effects related parameters. The list is assumed to have following components.

prior.D

a string defining the prior distribution for the covariance matrix of random effects D. It can be either “inv.wishart” or “sduniform”.

inv.wishart: in that case is assumed that the prior distribution of the matrix D is Inverse-Wishart with degrees of freedom equal to \tau and a scale matrix equal to S. When D is a matrix q\times q a prior expectation of D is equal to (\tau - q - 1)^{-1}S if \tau > q + 1. For q - 1 < \tau \leq q + 1 a prior expectation is not finite. Degrees of freedom parameter \tau does not have to be an integer. It has to only satisfy a condition \tau > q - 1. prior.b$df.D gives a prior degrees of freedom parameter \tau and prior.b$scale.D determines the scale matrix D. This is also the default choice.
sduniform: this can be used only when the random effect is univariate. Then the matrix D is just a scalar and the prior of \sqrt{D} (standard deviation of the univariate random effect) is assumed to be uniform on interval (0, S). The upper limit S is given by prior.b$scale.D.

df.D

degrees of freedom parameter \tau in the case that the prior of the matrix D is inverse-Wishart.

scale.D

a lower triangle of the scale matrix S in the case that the prior of the matrix D is inverse-Wishart or the upper limit S of the uniform distribution in the case that \sqrt{D} \sim \mbox{Unif}(0, S).

type.upd

a character vector specifying the type of the update. Valid are substrings of either "random.walk.metropolis" or "gibbs". Default is "gibbs". In contrast to \beta parameters, all random effects are updated using the same type of the move. If "random.walk.metropolis" is used, random effects may be divided into blocks in which they are updated. With "gibbs", there is only one block defined for all random effects. which are updated in one step using its full conditional distribution.

blocks

a list with the following components. This is set to NULL if type.upd = "gibbs".

ind.block

a list with vectors with indeces of random effects defining the block. Random intercept has always an index 1, remaining random effects have subsequent indeces according to their appearance in the design matrix X.

cov.prop

a list with vectors with a lower triangle of the covariance matrix which is used in the normal proposal (use a command lower.tri with diag = TRUE to get a lower triangle from a matrix) for a given block when

type.upd = "random.walk.metropolis".

weight.unif

a vector specifying the weight of the uniform component in the proposal for each block when

type.upd = "random.walk.metropolis".

If not specified, it is equal to 0.5 for all parameters. It is set to NULL if type.upd = "gibbs".

half.range.unif

a vector of same length as the number of random effects specifying the half range of the uniform component of the proposal when type.upd = "random.walk.metropolis". It is set to NULL if type.upd = "gibbs".

prop.revjump

a list of values defining in which way the reversible jumps will be performed.

algorithm

a string defining the algorithm used to generate canonical proposal vectors u = (u_{3k+1},\dots, u_{3k_{max}})' where u_{3k+1}, u_{3k+2}, u_{3k+3} are directly used when a jump to a space of higher dimension is proposed. These canonical proposal vectors are further transformed to give desired parameters (mixture component's weight, mean and variance). Valid values of prop.revjump$algorithm are substrings of "basic", "independent.av", "correlated.av". "basic" means that both components of vectors u and vectors u in time are generated independently from a standard uniform distribution. This corresponds to a basic reversible jumps McMC algorithm of Green (1995). Other two methods implement an auxiliary variable method of Brooks et al. (2003). The first one an independent auxiliary variable method where vectors u may be correlated in time however their components are independent and the second one the correlated auxiliary method where vectors u are correlated in time and also their components may be correlated. In both cases components of vectors u follow marginally a standard uniform distribution. A moody ring method of Brooks et al. (2003) is used to generate u vectors.

moody.ring

parameters for the moody ring when algorithm is either "independent.av" or "correlated.av". This is a two component vector with both components taking values between 0 and 0.5 defining the strength of a correlation in time and between the components of u vectors. This vector is ignored when algorithm = "basic". The first component of this vector determines dependence between u vectors in time (\varepsilon in Brooks et al. (2003)), the second component determines dependence between components of u vectors (\delta in Brooks et al. (2003)). The second compoenent is ignored when algorithm = "independent.av". Note that both \varepsilon and \delta do not have a meaning of correlation. They determine a range of additional uniform distributions. So that their values equal to 0 mean perfect correlation and their values equal to 0.5 mean independence. I.e. "correlated.av" with \delta = 0.5 is same as "independent.av" and "correlated.av" with \delta = 0.5, \varepsilon = 0.5 is same as "basic".

transform.split.combine

a description of how the canonical variables u are to be transformed to give new values of mixture component's weight, mean and variance when a split move is proposed. Possible values are substrings of "richardson.green", "brooks" and "identity". In all cases, the (0, 1) canonical variables u are transformed to (0, 1) variates v that are than used to compute new values of mixture component's weight, mean and variance using a method of moments matching described in Richardson and Green (1997). When "identity", no further transformation is performed, when "richardson.green", u vectors are transformed such that the components of resulting v vectors follow independently beta distributions with parameters given further by p = prop.revjump$transform.split.combine.parms such that in the triplet of v's used in a particular split move, v_1 \sim beta(p_1, p_2), v_2 \sim beta(p_3, p_4), v_3 \sim beta(p_5, p_6). When "brooks" v_2 is further transformed by |2v_2 - 1|. Default values of

prop.revjump$transform.split.combine$parms

is c(2, 2, 2, 2, 1, 1).

transform.split.combine.parms

see above.

transform.birth.death

a description of how the canonical variables u are to be transformed to give new values of mixture component's weight, mean and variance when a birth move is proposed. At this moment only one value is possible: "richardson.green" implementing the proposal as in Richardson and Green (1997).

init

a list of the initial values to start the McMC. Set to NULL such parameters that you want the program should itself sample for you or parameters that are not needed in your model.

iter: index of the iteration to which initial values correspond, usually zero.
mixture: initial mixture for the error random variable \varepsilon. It must a vector of length 1 + 3*kmax, where mixture[1] gives initial number of mixture of components k, mixture[2:(k+1)] gives initial mixture weights, mixture[(2+kmax):(2+kmax+k-1)] gives initial mixture means, mixture[(2+2*kmax):(2+2*kmax+k-1)] gives initial mixture variances. Remaining components of this vector are ignored.
beta: initial values of regression parameters in the same order as columns of the design matrix X. Call the function bayessurvreg1 with onlyX = TRUE to see how the columns are sorted. Remember, beta in this function contains both fixed effects \beta and means of random effect \gamma in the notation of the accompanying paper except the mean of the random intercept which is always zero.
b: initial values of random effects b_i for each cluster. This must a matrix of size q \times N or a vector of length q*N, where q is a number of random effects and N number of clusters, one column per cluster.
D: initial value for the covariance matrix of random effects D. Only its lower triangle must be given in a vector, e.g. c(d[1,1], d[2,1], d[3,1], d[2,2], d[3,2], d[3,3]) for a matrix 3 \times 3.
y: initial values of true log-event times. This must be a vector of length \sum_{i=1}^N n_i.
r: initial values of component labels r_{i,l}. This must be a vector of length \sum_{i=1}^N n_i.
otherp: initial values for other parameters. At this moment, only a value of the parameter \eta is given here.
u: initial canonical proposal vector of length 3k_{max}. When initial number of compoents given by init$mixture[1] is k, effectively only last 3k_{max} - 3*k components of the initial u vector are used. Further, when prop.revjump$algorithm = "correlated.av", the first component of init$u (init$u[1]) contains an initial mood parameter (C_0 in Brooks et al. (2003)) for the moody ring.

store

a list that defines which sampled values besides regression parameters \beta, means of random effects \gamma (both stored in a file called beta.sim), a covariance matrix of random effects D (stored in a file D.sim), the mixture (stored in file mixmoment.sim, mweight.sim, mmean.sim, mvariance.sim), values of other parameters - \eta (stored in a file otherp.sim), values of log-likelihoods (stored in a file loglik.sim), information concerning the performance of the reversible jump McMC and acceptance of regression parameters (stored in a file MHinfo.sim), iteration indeces (stored in a file iteration.sim) are to be stored. The list store has the following components.

y: if TRUE sampled true log-event times are stored.
r: if TRUE sampled component labels are stored.
b: if TRUE sampled values of random effects b_i are stored.
u: if TRUE sampled values of canonical proposal vectors for the reversible jump McMC are stored.
MHb: if TRUE information concerning the performance of the Metropolis-Hastings algorithm for the update of random effects (if used instead of a dafault Gibbs) is stored.
regresres: if TRUE sampled values of regression residuals at each iteration are stored. The regression residual is defined as res_{i,l} = \log(t_{i,l}) - \beta^T x_{i,l} - b_i^T z_{i,l}.

In the case that either store$y, or store$r, or store$b, or store$u are FALSE, only the last values of either y, or r, or b, or u at the time of writting of remaining quantities are stored in appropriate files (without headers) to be possibly used by bayessurvreg1.files2init function.

dir

a string that specifies a directory where all sampled values are to be stored.

toler.chol

tolerance for the Cholesky decomposition.

toler.qr

tolerance for the QR decomposition.

...

who knows?

Value

A list of class bayessurvreg containing an information concerning the initial values and prior choices.

Files created

Additionally, the following files with sampled values are stored in a directory specified by dir parameter of this function (some of them are created only on request, see store parameter of this function).

iteration.sim

one column labeled iteration with indeces of McMC iterations to which the stored sampled values correspond.

loglik.sim

two columns labeled loglik and randomloglik.

\mbox{\code{loglik}} = \sum_{i=1}^{N}\sum_{l=1}^{n_i}\Biggl[ \biggl\{ \log\Bigl(\frac{1}{\sqrt{2\pi\sigma_{r_{i,l}}^2}}\Bigr) -\frac{(y_{i,l} - \beta^T x_{i,l} - b_i^T z_{i,l} - \mu_{r_{i,l}})^2}{2\sigma_{r_{i,l}}^2} \biggr\} \Biggr],

where y_{i,l} denotes (sampled) (i,l)th true log-event time, b_i sampled value of the random effect vector for the ith cluster, \beta sampled value of the regression parameter \beta and k, w_j, \mu_j, \sigma_j^2, j = 1,\dots,k sampled mixture at each iteration.

\mbox{\code{randomloglik}} = \sum_{i=1}^{N}\log\Bigl(g(b_i)\Bigr),

where g denotes a density of (multivariate) normal distribution N(\gamma, D), where \gamma is a sampled value of the mean of random effect vector and D is a sampled value of the covariance matrix of the random effects at each iteration.

mixmoment.sim

three columns labeled k, Intercept and Scale. These are the number of mixture components, mean and standard deviation of the sampled error distribution (mixture) at each iteration.

mweight.sim

each row contains mixture weights w_1,\dots,w_k at each iteration. From the header of this file, maximal number of mixture components specified in the prior can be derived.

mmean.sim

each row contains mixture means \mu_1,\dots,\mu_k at each iteration. From the header of this file, maximal number of mixture components specified in the prior can be derived.

mvariance.sim

each row contains mixture variances \sigma^2_1,\dots,\sigma^2_k at each iteration. From the header of this file, maximal number of mixture components specified in the prior can be derived.

beta.sim

columns labeled according to name of the design matrix. These are sampled values of regression parameters \beta and means of random effects \gamma (except the mean of the random intercept which is zero).

b.sim

columns labeled nameb[1].id[1], ..., nameb[q].id[1], ..., nameb[1].id[N], ..., nameb[q].id[N], where q is a dimension of the random effect vector b_i and N number of clusters. nameb is replaced by appropriate column name from the design matrix and id is replaced by identificator of the clusters. This gives sampled values of the random effects for each cluster.

D.sim

columns labeled det, D.s.t, s = 1,..., q, t = s,...,q, where q is dimension of the random effect vector b_i. Column det gives a determinant of the covariance matrix D of the random effects at each iteration, remaining columns give a lower triangle of this matrix at each iteration.

Y.sim

columns labeled Y[m] where m goes from 1 to \sum_{i=1}^{N}n_i. This gives sampled log-event times for each observation in the dataset at each iteration.

r.sim

columns labeled r[m] where m goes from 1 to \sum_{i=1}^{N}n_i. This gives sampled mixture labels for each observation in the dataset at each iteration.

otherp.sim

Currently only one column labeled eta that gives sampled values of the hyperparameter \eta.

MHinfo.sim

this gives the information concerning the performance of reversible jump algorithm and a sampler of regression parameters \beta and means of random effects \gamma. It has columns

accept.spl.comb: relative frequency of accepted split-combine moves up to that iteration.
split: relative frequency of proposed split moves up to that iteration.
accept.birth.death: relative frequency of accepted birth-death moves up to that iteration.
birth: relative frequency of proposed birth moves up to that iteration.
beta.block.m: with m going from 1 to number of defined blocks of beta parameters. This gives a relative frequency of accepted proposals for each block up to that iteration. When Gibbs move is used, these should be columns of ones.

MHbinfo.sim

this gives the information concerning the performance of a sampler for random effects (relative frequency of accepted values for each cluster and each block of random effects updated together). When Gibbs move is used only ones are seen in this file.

u.sim

Sampled values of canonical proposal variables for reversible jump algorithm are stored here. This file is useful only when trying to restart the simulation from some specific point.

regresres.sim

columns labeled res[m] where m goes from 1 to \sum_{i=1}^{N}n_i This stores so called regression residuals for each observation at each iteration. This residual is defined as

res_{i,l} = y_{i,l} - \beta^T x_{i,l} - b_i z_{i,l},\qquad i=1\dots,N,\quad l=1,\dots,n_i,

where y_{i,l} is a (sampled) log-event time at each iteration.

Author(s)

Arnošt Komárek arnost.komarek@mff.cuni.cz

References

Komárek, A. (2006). Accelerated Failure Time Models for Multivariate Interval-Censored Data with Flexible Distributional Assumptions. PhD. Thesis, Katholieke Universiteit Leuven, Faculteit Wetenschappen.

Komárek, A. and Lesaffre, E. (2007). Bayesian accelerated failure time model for correlated interval-censored data with a normal mixture as an error distribution. Statistica Sinica, 17, 549 - 569.

Brooks, S. P., Giudici, P., and Roberts, G. O. (2003). Efficient construction of reversible jump Markov chain Monte Carlo proposal distribution (with Discussion). Journal of the Royal Statistical Society B, 65, 3 - 55.

Green, P. J. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82, 711 - 732.

Richardson, S., and Green, P. J. (1997). On Bayesian analysis of mixtures with unknown number of components (with Discussion). Journal of the Royal Statistical Society B, 59, 731 - 792.

Examples

## See the description of R commands for
## the models described in
## Komarek (2006),
## Komarek and Lesaffre (2007).
## 
## R commands available
## in the documentation
## directory of this package as
## - ex-cgd.R and
##   https://www2.karlin.mff.cuni.cz/~komarek/software/bayesSurv/ex-cgd.pdf
##
## - ex-tandmobMixture.R and
##   https://www2.karlin.mff.cuni.cz/~komarek/software/bayesSurv/ex-tandmobMixture.pdf
##

[Package bayesSurv version 3.7 Index]