simJM {bamlss} | R Documentation |

## Simulate longitudinal and survival data for joint models

### Description

Simulates longitudinal data with normal errors and (Cox-type) survival times
using the inversion method. The function `simJM()` is a wrapper that specifies
all predictors and returns the resulting data sets. The wrapper calls `rJM()` to sample
the survival times; `rJM()` is a modified version of `rSurvtime()` from the **R** package
CoxFlexBoost.

### Usage

```
simJM(nsub = 300, times = seq(0, 120, 1), probmiss = 0.75,
  long_setting = "functional",
  alpha_setting = if(nonlinear) "linear" else "nonlinear",
  dalpha_setting = "zero", sigma = 0.3, long_df = 6, tmax = NULL,
  seed = NULL, full = FALSE, file = NULL, nonlinear = FALSE,
  fac = FALSE)

rJM(hazard, censoring, x, r,
  subdivisions = 1000, tmin = 0, tmax,
  file = NULL, ...)
```

### Arguments

`nsub` |
number of individuals for which longitudinal data and survival times should be simulated. |

`times` |
vector of time points at which longitudinal measurements are "sampled". |

`probmiss` |
proportion of longitudinal measurements to be set to missing. Used to induce sparsity in the longitudinal measurements. |

`long_setting` |
specification of the longitudinal trajectories of the sampled subjects.
Preset specifications are `linear`, `nonlinear`, and `functional` (see Details). |

`alpha_setting` |
specification of the association between the survival and longitudinal parts of the model. Preset
specifications include `constant`, `linear`, and `nonlinear` (see Details). |

`dalpha_setting` |
specification of the association between survival and the derivative of the longitudinal trajectory. Work in progress. |

`sigma` |
standard deviation of the normal error around the true longitudinal measurements. |

`long_df` |
number of basis functions from which functional random intercepts are sampled. |

`tmax` |
For function |

`seed` |
numeric scalar setting the random seed. |

`full` |
logical indicating if only the longitudinal data set should be returned (`FALSE`) or a list of all three data sets (`TRUE`); see Value. |

`file` |
name of the data file the generated data set should be stored into
(e.g., "simdata.RData") or NULL if the dataset should directly be returned in |

`nonlinear` |
If set to |

`fac` |
If set to |

`hazard` |
complete hazard function to specify the joint model. Time must be the first argument. |

`censoring` |
function to compute (random) censoring. |

`x` |
matrix of sampled covariate values. |

`r` |
matrix of sampled random coefficients. |

`subdivisions` |
the maximum number of subintervals for the integration. |

`tmin` |
earliest time point to sample a survival time. |

`...` |
further arguments to be passed to |

### Details

The function simulates longitudinal data based on the given specification at the given `times`.
The full hazard is built from all joint model predictors `\eta_{\mu}`, `\eta_{\sigma}`,
`\eta_{\lambda}`, `\eta_{\gamma}`, and `\eta_{\alpha}` as presented in
Koehler, Umlauf, and Greven (2016); see also `jm_bamlss`. Survival times are sampled using the inversion
method (cf. Bender, Augustin, & Blettner, 2005). Additional censoring and missingness are
introduced, and the longitudinal information is censored according to the survival information.
Users can also specify their own predictors and use only `rJM` to simulate survival times
accordingly.
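The inversion method rests on the fact that, for a survival time `T` with cumulative hazard `H`, the survival probability `S(T) = \exp(-H(T))` is uniformly distributed, so a draw of `T` solves `H(T) = -\log(U)` with `U` uniform on (0, 1). The following is a minimal base-R sketch of that idea for one subject. It is not the implementation of `rJM()`; the helper `sample_survtime()` and the example hazard are hypothetical and chosen only for illustration.

```r
# Sketch of inversion sampling (base R only).
# `sample_survtime` and `hazard` are hypothetical; this is not the code of rJM().
sample_survtime <- function(hazard, tmin = 0, tmax = 120, subdivisions = 1000) {
  target <- -log(runif(1))  # solve H(T) = -log(U), U ~ Uniform(0, 1)
  cumhaz <- function(t) {
    if (t <= tmin) return(0)
    integrate(hazard, lower = tmin, upper = t,
              subdivisions = subdivisions)$value
  }
  # If the cumulative hazard never reaches the target, censor at tmax.
  if (cumhaz(tmax) < target) {
    return(c(time = tmax, event = 0))
  }
  root <- uniroot(function(t) cumhaz(t) - target,
                  lower = tmin, upper = tmax)$root
  c(time = root, event = 1)
}

set.seed(1)
hazard <- function(t) exp(-4 + 0.02 * t)  # hypothetical log-linear hazard
draws <- t(replicate(200, sample_survtime(hazard)))
head(draws)
```

Numerical integration plus root finding is used here because a time-varying hazard rarely has a closed-form inverse cumulative hazard; the `subdivisions`, `tmin`, and `tmax` arguments mirror the roles described for `rJM()` above.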

Prespecified functions for `\eta_{\mu}` in `long_setting` are, for `linear`,
`\eta_{\mu i}(t) = 1.25 + r_{1i} + 0.6 \sin(x_{2i}) + (-0.01) t + 0.02 r_{2i} t`,
for `nonlinear`,
`\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t)`,
and for `functional`,
`\eta_{\mu i}(t) = 0.5 + r_{1i} + 0.6 \sin(x_{2i}) + 0.1 (t+1) \exp(-0.075 t) + \sum_k \beta_{ki} B_k(t)`,
where `B_k(.)` denotes a B-spline basis function and `\beta_{ki}` are the sampled penalized
coefficients from `gen_b` per person.
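As an illustration, the `linear` and `nonlinear` presets can be evaluated directly from the formulas above. This is a sketch only: the function names and the subject-level values `r1`, `r2`, and `x2` are hypothetical, and the `functional` preset is omitted because it depends on coefficients sampled by `gen_b`.

```r
# Preset longitudinal predictors, transcribed from the formulas above.
# Function names and subject-level values are hypothetical.
eta_mu_linear <- function(t, r1, r2, x2) {
  1.25 + r1 + 0.6 * sin(x2) + (-0.01) * t + 0.02 * r2 * t
}
eta_mu_nonlinear <- function(t, r1, x2) {
  0.5 + r1 + 0.6 * sin(x2) + 0.1 * (t + 1) * exp(-0.075 * t)
}

times <- seq(0, 120, 1)
r1 <- 0.2; r2 <- -0.5; x2 <- 1  # hypothetical random effects and covariate

mu_lin <- eta_mu_linear(times, r1, r2, x2)
mu_nl  <- eta_mu_nonlinear(times, r1, x2)

# Noisy "observed" measurements with normal error, here using sigma = 0.3
set.seed(1)
y <- mu_nl + rnorm(length(times), sd = 0.3)
head(cbind(times, mu_lin, mu_nl, y))
```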

Prespecified functions for `\eta_{\alpha}` in `alpha_setting` are, for `constant`,
`\eta_{\alpha}(t) = 1`, for `linear`,
`\eta_{\alpha}(t) = 1 - 0.015 t`, for `nonlinear`,
`\eta_{\alpha}(t) = \cos((t-20)/20)`, and for a second `nonlinear` specification,
`\eta_{\alpha}(t) = \cos((t-33)/33)`.
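To see how these association presets behave over the default time grid, they can be written as plain R functions. This is a sketch with hypothetical function names; it only transcribes the formulas above and does not call into bamlss.

```r
# Association presets, transcribed from the formulas above.
# Function names are hypothetical.
alpha_constant    <- function(t) rep(1, length(t))
alpha_linear      <- function(t) 1 - 0.015 * t
alpha_nonlinear   <- function(t) cos((t - 20) / 20)
alpha_nonlinear_b <- function(t) cos((t - 33) / 33)

times <- seq(0, 120, 1)
alpha <- cbind(
  constant    = alpha_constant(times),
  linear      = alpha_linear(times),
  nonlinear   = alpha_nonlinear(times),
  nonlinear_b = alpha_nonlinear_b(times)
)

# Range of each preset on the grid; the linear preset changes sign
# once t exceeds 1 / 0.015 (about 66.7).
apply(alpha, 2, range)
```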

Additionally, the fixed function `\eta_{\lambda} = 0.1(t+2)\exp(-0.075t)` is employed.

### Value

For `full = TRUE`, a list of three `data.frame`s is returned:

`data` |
Simulated dataset in long format including all longitudinal and survival covariates. |

`data_grid` |
Dataset of the time-varying survival predictors which are not subject specific, evaluated at a grid of fixed time points. |

`data_full` |
Simulated data set prior to generating longitudinal missings. Useful to assess the longitudinal fit. |

For `full = FALSE`, only the first dataset is returned.

Covariates within these datasets include a subject identifier `id`, the sampled survival
times `survtime`, the event indicator `event`, the time points of longitudinally
"observed" measurements `obstime`, the longitudinal response `y`, the cumulative
hazard at the survival time `cumhaz`, as well as covariates `x1, x2`, random effects
`r1, r2, b1, ...`, and the true predictors `alpha, lambda, gamma, mu, sigma`.
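The long format built around `id`, `survtime`, and `obstime`, together with the censoring and missingness steps mentioned in Details, can be mimicked in a few lines of base R. This is a sketch under loud assumptions: the survival times are drawn from an arbitrary uniform distribution, and missingness is modeled by dropping rows independently with probability `probmiss`. The actual mechanism inside `simJM()` may differ.

```r
# Sketch: censor a longitudinal grid by survival time, then thin it.
# Survival times and the row-dropping mechanism are assumptions for illustration.
set.seed(1)
nsub <- 5
times <- seq(0, 120, 1)
probmiss <- 0.75

survtime <- runif(nsub, min = 10, max = 120)  # hypothetical survival times

# One row per subject and time point (long format)
full <- expand.grid(obstime = times, id = seq_len(nsub))
full$survtime <- survtime[full$id]

# Keep only measurements taken up to each subject's survival time
full <- full[full$obstime <= full$survtime, ]

# Drop each remaining measurement with probability probmiss
keep <- runif(nrow(full)) >= probmiss
sparse <- full[keep, ]

table(sparse$id)  # number of retained measurements per subject
```

Keeping both `full` and `sparse` parallels the distinction between `data_full` and `data` above: the unthinned version is useful for judging how well a fitted trajectory recovers measurements that were never "observed".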

### References

Hofner, B. (2016). CoxFlexBoost: Boosting Flexible Cox Models
(with Time-Varying Effects). **R** package version 0.7-0.

Bender, R., Augustin, T., and Blettner, M. (2005).
Generating Survival Times to Simulate Cox Proportional Hazards Models.
*Statistics in Medicine*, **24**, 1713-1723.

Koehler, N., Umlauf, N., Beyerlein, A., Winkler, C., Ziegler, A., and Greven, S. (2016).
Flexible Bayesian Additive Joint Models with an Application to Type 1 Diabetes Research. *(submitted)*

### See Also

`jm_bamlss`, `opt_JM`, `sam_JM`, `bamlss`.

### Examples

```
## Not run: 
## Simulate survival data
## with functional random intercepts and a nonlinear effect
## of time, time-varying association alpha.
d <- simJM(nsub = 300)
head(d)

## Simulate survival data
## with random intercepts/slopes and a linear effect of time,
## constant association alpha.
d <- simJM(nsub = 200, long_setting = "linear",
  alpha_setting = "constant")
head(d)
## End(Not run)
```

*bamlss* version 1.2-3