joint {INLAjoint}R Documentation

Fit a multivariate joint model for longitudinal and/or survival data

Description

This function fits a multivariate joint model for longitudinal and/or survival data. The longitudinal outcomes are modeled with mixed effects models and can handle various distributions. The survival outcomes (i.e., terminal event with possibly competing risks) are modeled with Cox proportional hazards regression models. Various association structures can be specified between the longitudinal and survival outcomes. The inference is based on Integrated Nested Laplace Approximations (Rue et al., 2009).

Usage

joint(
  formSurv = NULL,
  formLong = NULL,
  dataSurv = NULL,
  dataLong = NULL,
  id = NULL,
  timeVar = NULL,
  family = "gaussian",
  link = "default",
  basRisk = "rw1",
  NbasRisk = 15,
  cutpoints = NULL,
  assoc = NULL,
  assocSurv = NULL,
  corLong = FALSE,
  corRE = TRUE,
  dataOnly = FALSE,
  longOnly = FALSE,
  control = list()
)

Arguments

formSurv

the formula for the time-to-event outcome, with the response given as an inla.surv() object. Keep it as NULL if no survival part is needed and give a list of formulas for competing risks.

formLong

the formula for the longitudinal outcome, structured as with the lme4 package (Mächler et al., 2015) for linear mixed-effects models (i.e., including random effects within parenthesis). Keep it as NULL if no longitudinal part is needed and give a list of formulas for multivariate longitudinal.

dataSurv

the dataset for the survival part. Keep it as NULL if no survival part is needed or if the survival data is in the longitudinal dataset (it will extract the last line for each individual as the survival dataset).

dataLong

the dataset for the longitudinal part. Keep it as NULL if no longitudinal part is needed. For multivariate longitudinal models, either give one dataset with all outcomes and covariates or a list of datasets for each longitudinal marker.

id

the name of the variable to identify individuals or grouped repeated measurements. Keep is as NULL if no longitudinal part is needed.

timeVar

a character string (or a vector) giving the name of the time-varying variable(s). Functions of time can be included in formulas, they first need to be set up as a univariate function with name fX, where X is a number between 1 and 20. Then the function can be used directly in the formula (see example below).

family

a character string (or a vector) giving the name of families for the longitudinal outcomes. The list of the available families is given by names(inla.models()$likelihood).

link

a character string (or a vector) giving the link function associated to the families for the longitudinal outcomes. The links available for a family is given in the associated doc: inla.doc("familyName"). The link should be a vector of the same size as the family parameter and should be set to "default" for default (i.e., identity for gaussian, log for poisson, logit for bimomial,...).

basRisk

the baseline risk of event (should be a vector in case of competing risks). It can be defined as parametric with either "exponentialsurv" for exponential baseline or "weibullsurv" for Weibull baseline (note that there are two formulations of the Weibull distribution, see 'inla.doc("weibull")' for more details, default is variant = 0). Alternatively, there are two options to avoid parametric assumptions on the shape of the baseline risk: "rw1" for random walks of order one prior that corresponds to a smooth spline function based on first order differences. The second option "rw2" assigns a random walk order two prior that corresponds to a smooth spline function based on second order differences. This second option provides a smoother spline compared to order one since the smoothing is then done on the second order.

NbasRisk

the number of intervals for the baseline risk function, only one value should be provided and the same number of intervals is used for each risk submodel in case of competing risks.

cutpoints

a vector with baseline hazard cutpoints if not using equidistant (if not NULL, this replaces the NbasRisk parameter).

assoc

a character string that specifies the association between the longitudinal and survival components. The available options are "CV" for sharing the current value of the linear predictor, "CS" for the current slope, "CV_CS" for the current value and the current slope, "SRE" for shared random effects (i.e., sharing the individual deviation from the mean at time t as defined by the random effects), "SRE_ind" for shared random effect independent (each random effect's individual deviation is associated to an association parameter in the survival submodel) and "" (empty string) for no association. When there are either multiple longitudinal markers or multiple competing events, this should be a vector. In case of both multiple markers and events, it should be a list with one element per longitudinal marker and each element is a vector of association for each competing event. Keep it as NULL to have no association between longitudinal and survival components or if there is no survival component.

assocSurv

a boolean that indicates if a frailty term (i.e., random effect) from a survival model should be shared into another survival model. The order is important, the first model in the list of survival formulas ('formSurv') should include a random effect and it can be shared in the next formulas. Multiple survival models with random effects can be accomodated and a random effect can be shared in multiple survival models, following the same structure as 'assoc' (i.e., vector of booleans if one random effect is shared in multiple survival and list of vectors if multiple survival models with random effects share their random effects in multiple survival models).

corLong

a boolean that only applies when multiple longitudinal markers are fitted: should the random effects accross markers be correlated (TRUE) or independent (FALSE)? Default is FALSE.

corRE

list of the size of number of groups of random effects (i.e., equal to 1 if there is only one longitudinal marker or if corLong is TRUE and equal to the number of markers otherwise), each element is a boolean indicating if the random effects of the group must be correlated or independent (i.e., diagonal variance-covariance). Default is FALSE.

dataOnly

a boolean to only prepare the data with the correct format without running the model.

longOnly

a boolean to only prepare the data for the longitudinal part of a longitudinal-survival joint model with the correct format without running the model.

control

a list of control values that can be set with control=list(), with components:

priorFixed

list with mean and standard deviations for the Gaussian prior distribution for the fixed effects. Default is list(mean=0, prec=0.01, mean.intercept=0, prec.intercept=0.01), where mean and prec are the mean and precision (i.e., inverse of the variance) of the fixed effects, respectively and mean.intercept and prec.intercept are the corresponding parameters for the fixed intercept.

priorAssoc

list with mean and standard deviations for the Gaussian prior distribution for the association parameters (does not apply to "SRE_ind" association and shared random effect from survival models (frailty), see next item for those two). Default is list(mean=0, prec=0.01)

priorSRE_ind

list with mean and standard deviations for the Gaussian prior distribution for the association of independent random effects ("SRE_ind" and survival frailty random effects shared). Default is list(mean=0, prec=1)

priorRandom

list with prior distribution for the multivariate random effects (Inverse Wishart). Default is list(r=10, R=1), see "inla.doc("iidkd") for more details.

initVC

list of the size of number of groups of random effects giving initial values for variance-covariance of random effects, first values are variance and then covariances (as displayed in summary). All the elements of the covariance matrix must be fixed but in case of multiple groups of random effects, it is possible to fix initial values for only some groups, then elements in the list that are not initialized must be an empty string.

initSD

same as initVC but to fix standard deviations and correlations instead (only one of these two arguments can be used).

strata

list of the same size as the number of survival submodels, giving the name of covariates for stratified proportional hazards model (default is NULL).

fixRE

list of the size of number of groups of random effects, each element is a boolean indicating if the random effects of the group must be fixed or estimated.

assocInit

Initial value for all the association parameters (default is 0.1).

int.strategy

a character string giving the strategy for the numerical integration used to approximate the marginal posterior distributions of the latent field. Available options are "ccd" (default), "grid" or "eb" (empirical Bayes). The empirical Bayes uses only the mode of the approximations for the integration, which speed up and simplifies computations. It can be pictured as a tradeoff between Bayesian and frequentist estimation strategies while the default full Bayesian accounts for uncertainty by using the mode and the curvature at the mode.

Ntrials

Number of trials for binomial and Betabinomial distributions, default is NULL.

cpo

TRUE/FALSE: Default is FALSE, set to TRUE to compute the Conditional Predictive Ordinate.

cfg

TRUE/FALSE: Default is FALSE, set to TRUE to be able to sample from the posterior distribution.

safemode

TRUE/FALSE: use the INLA safe mode (automatically reruns in case of negative eigenvalue(s) in the Hessian, reruns with adjusted starting values in case of crash). Default is TRUE (activated). The message “'*** inla.core.safe“' appears when the safe mode is running, it improves the inference of the hyperparameters and can be ignored. To remove this safe mode, switch the boolean to FALSE (it can save some computation time but may return slightly less precise estimates for some hyperparameters).

rerun

TRUE/FALSE: the model reruns to improve numerical stability (default is FALSE).

tolerance

accuracy in the inner optimization (default is 0.005).

h

step-size for the hyperparameters (default is 0.005).

verbose

TRUE/FALSE: prints details of the INLA algorithm. Default is FALSE.

keep

TRUE/FALSE: keep internal files. Default is FALSE. (expert option)

Value

An object of class INLAjoint. See INLAjoint.object for details.

References

Rustand, D., van Niekerk, J., Teixeira Krainski, E., Rue, H. and Proust-Lima, C. (2023). Fast and flexible inference for joint models of multivariate longitudinal and survival data using integrated nested Laplace approximations. Biostatistics, 2023, kxad019. https://doi.org/10.1093/biostatistics/kxad019 https://arxiv.org/abs/2203.06256

Rustand, D., van Niekerk, J., Rue, H., Tournigand, C., Rondeau, V. and Briollais, L. (2023). Bayesian Estimation of Two-Part Joint Models for a Longitudinal Semicontinuous Biomarker and a Terminal Event with R-INLA: Interests for Cancer Clinical Trial Evaluation. Biometrical Journal, 65, 2100322. https://doi.org/10.1002/bimj.202100322 https://arxiv.org/abs/2010.13704

Rue, H., Martino, S. and Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71: 319-392. https://doi.org/10.1111/j.1467-9868.2008.00700.x

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

Contact: INLAjoint@gmail.com

Examples

# joint model with 3 longitudinal / 3 competing risks of event
data(Longsim)
data(Survsim)
f1 <- function(x) x^2 # quadratic function of time for first marker
Nsplines <- splines::ns(Longsim$time, knots=2) # 2 ns splines for second marker
f2 <- function(x) predict(Nsplines, x)[,1]
f3 <- function(x) predict(Nsplines, x)[,2]

if(requireNamespace("INLA")){
 JMINLA <- joint(
  formLong = list(Y1 ~ time + f1(time) + ctsX + binX + (1 + time + f1(time) | Id),
                  Y2 ~ time + f2(time) + f3(time) + ctsX + binX + (1 | Id),
                  Y3 ~ time + ctsX + binX + (1 | Id)),
  formSurv = list(INLA::inla.surv(deathTimes, Event1) ~ binX + ctsX,
                  INLA::inla.surv(deathTimes, Event2) ~ binX,
                  INLA::inla.surv(deathTimes, Event3) ~ ctsX),
  dataLong = Longsim, dataSurv=Survsim, id = "Id", timeVar = "time", corLong=TRUE,
  family = c("gaussian", "poisson", "binomial"), basRisk = c("rw1", "rw1", "rw1"),
  assoc = list(c("CV", "CS", ""),  c("CV", "", "SRE"), c("", "CV", "")),
  control=list(int.strategy="eb"))

 summary(JMINLA)
 # 'sdcor' to switch from variance-covariance to standard
 # deviation-correlation and 'hazr' to switch parameters
 # in survival submodels from mean to hazard ratios (exp(mean)).
 summary(JMINLA, sdcor=TRUE, hazr=TRUE)
}

[Package INLAjoint version 24.3.25 Index]