R: mexhazLT function

mexhazLT {xhaz}

R Documentation

mexhazLT function

Description

Extends excess hazard models from the mexhaz R-package to allow rescaling (Goungounga et al. (2019) doi:10.1186/s12874-019-0747-3) of the background mortality in the presence or absence of multilevel data (Goungounga et al. (2023) doi:10.1002/bimj.202100210). It allows for different shapes of the baseline hazard, time-dependent effects of variables, and a random effect at the cluster level.
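
As a rough sketch of what rescaling means here, the observed hazard can be written as the sum of an excess hazard and a rescaled population hazard. The notation below is an assumption made for illustration and is not taken from this help page: lambda_E is the excess hazard, lambda_P and Lambda_P are the population hazard and cumulative hazard supplied through `expected` and `expectedCum`, alpha is the rescaling parameter (taken as 1 when `pophaz = "classic"`), w_j is the random effect of cluster j, and delta_ij is the event indicator. The second line is the log-likelihood contribution of individual i in cluster j, conditional on w_j.

```latex
\begin{align*}
\lambda_O(t \mid x_{ij}, w_j) &= \lambda_E(t \mid x_{ij}) \exp(w_j) + \alpha \, \lambda_P(t) \\
\ell_{ij} &= \delta_{ij} \log\bigl\{ \lambda_E(t_{ij} \mid x_{ij}) \exp(w_j) + \alpha \, \lambda_P(t_{ij}) \bigr\}
            - \Lambda_E(t_{ij} \mid x_{ij}) \exp(w_j) - \alpha \, \Lambda_P(t_{ij})
\end{align*}
```

Because alpha multiplies the cumulative population hazard in the last term, that term depends on an estimated parameter and cannot be dropped as a constant, which is consistent with the remark on `loglik` in the Value section.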

Usage

mexhazLT(
  formula,
  data,
  expected = "expected",
  expectedCum = "expectedCum",
  pophaz = "classic",
  base = c("weibull", "exp.bs", "exp.ns", "pw.cst"),
  degree = 3,
  knots = NULL,
  bound = NULL,
  n.gleg = 20,
  init = NULL,
  random = NULL,
  n.aghq = 10,
  fnoptim = c("nlm", "optim"),
  verbose = 0,
  method = "Nelder-Mead",
  iterlim = 10000,
  numHess = FALSE,
  print.level = 1,
  exactGradHess = TRUE,
  gradtol = ifelse(exactGradHess, 1e-08, 1e-06),
  testInit = TRUE,
  keep.data = FALSE,
  ...
)

Arguments

`formula`	a formula object, with the response on the left of a `~` operator and the terms on the right. The response must be a survival object as returned by the `Surv` function (time first, status second).
`data`	a data frame in which the variables named in the formula are to be interpreted.
`expected`	name of the variable (must be given in quotes) representing the population instantaneous hazard.
`expectedCum`	name of the variable (must be given in quotes) representing the population cumulative hazard.
`pophaz`	character string taking one of two values, `"classic"` or `"rescaled"`. If `pophaz = "classic"`, the function fits models that do not require the background mortality to be rescaled, assuming that the comparability assumption holds; if `pophaz = "rescaled"`, it fits models that require the background mortality to be rescaled.
`base`	functional form that should be used to model the baseline hazard. Selection can be made between the following options: `"weibull"` for a Weibull hazard, `"exp.bs"` for a hazard described by the exponential of a B-spline (only B-splines of degree 1, 2 or 3 are accepted), `"exp.ns"` for a hazard described by the exponential of a restricted cubic spline (also called 'natural spline'), `"pw.cst"` for a piecewise constant hazard. By default, `base="weibull"` as in the mexhaz R-package.
`degree`	if `base="exp.bs"`, `degree` represents the degree of the B-spline used. Only integer values between 1 and 3 are accepted, and 3 is the default.
`knots`	if `base="exp.bs"` or `"exp.ns"`, `knots` is the vector of interior knots of the spline. If `base="pw.cst"`, `knots` is the vector defining the endpoints of the time intervals on which the hazard is assumed to be constant. By default, `knots=NULL` (that is, it produces a B-spline with no interior knots if `base="exp.bs"`, a linear B-spline with no interior knots if `base="exp.ns"`, or a constant hazard over the whole follow-up period if `base="pw.cst"`).
`bound`	a vector of two numerical values corresponding to the boundary knots of the spline functions. If `base="exp.bs"` or `base="exp.ns"`, computation of the B-spline basis requires that boundary knots be given. The `bound` argument allows the user to specify these boundary knots. If `base="exp.bs"`, the interval defined by the boundary knots must at least include the interval `c(0,max(time))` (otherwise, there could be problems with ill-conditioned bases). If `base="exp.ns"`,
`n.gleg`	corresponds to the number of quadrature nodes to be specified as in `mexhaz`.
`init`	vector of initial values as in `mexhaz`.
`random`	name of the variable to be entered as a random effect (must be given between quotes), representing the cluster membership. As in `mexhaz`, `random=NULL` means that the function fits a fixed effects model.
`n.aghq`	corresponds to the number of quadrature points to be specified as in `mexhaz` for the estimation of the cluster-specific marginal likelihoods by adaptive Gauss-Hermite quadrature.
`fnoptim`	name of the R optimisation procedure used to maximise the likelihood. Selection can be made between "nlm" (by default) and "optim". Note: if `exactGradHess=TRUE`, this argument will be ignored (fnoptim will be set automatically to `"nlm"`).
`verbose`	integer parameter representing the frequency at which the current state of the optimisation process is displayed. If verbose=0 (default), nothing is displayed.
`method`	if fnoptim="optim", method represents the optimisation method to be used by optim. By default, `method="Nelder-Mead"`. This parameter is not used if `fnoptim="nlm"`.
`iterlim`	if `fnoptim="nlm"`, iterlim represents the maximum number of iterations before the nlm optimisation procedure is terminated. By default, iterlim is set to 10000. This parameter is not used if `fnoptim="optim"` (in this case, the maximum number of iterations must be given as part of a list of control parameters via the control argument: see the help page of optim for further details).
`numHess`	logical value allowing the user to choose between the Hessian returned by the optimization algorithm (default) or the Hessian estimated by the hessian function from the `numDeriv` package.
`print.level`	this argument is only used if `fnoptim="nlm"`. It determines the level of printing during the optimisation process. The default value (for the mexhaz function) is set to `'1'`, which means that details on the initial and final step of the optimisation procedure are printed (see the help page of nlm for further details).
`exactGradHess`	logical value allowing the user to decide whether maximisation of the likelihood should be based on the analytic gradient and Hessian computed internally (default, corresponding to `exactGradHess=TRUE`).
`gradtol`	this argument is only used if `fnoptim="nlm"`. It corresponds to the tolerance at which the scaled gradient is considered close enough to zero to terminate the algorithm. The default value depends on the value of the argument `exactGradHess`.
`testInit`	this argument is used only when `exactGradHess=TRUE` and when the model is not an excess hazard random effect model. It instructs the mexhaz function to try several vectors of initial values in case optimization was not successful with the default (or user-defined) initial values. Because optimization based on the analytical gradient and Hessian is usually fast, this simple and empirical procedure proves useful to increase the probability of convergence in cases when it is difficult to specify appropriate initial values.
`keep.data`	logical argument determining whether the dataset should be kept in the object returned by the function: this can be useful in certain contexts (e.g., to calculate cluster-specific posterior predictions from a random intercept model) but might create unnecessarily voluminous objects. The default value is set to `FALSE`.
`...`	other parameters used with the `mexhazLT` function
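
The `knots` and `bound` arguments can be prepared before the call. The sketch below uses simulated follow-up data (the vectors `time` and `status` are made up for illustration) to place two interior knots at the tertiles of the observed event times and to set boundary knots covering `c(0, max(time))`, as required for `base="exp.bs"`. The call to `splines::bs` only illustrates the size of a cubic B-spline basis on those knots; it is not meant to describe how `mexhazLT` builds its basis internally.

```r
library(splines)  # distributed with R

set.seed(1)
# Made-up follow-up data: times in years and an event indicator (1 = death)
time   <- rexp(200, rate = 0.2)
status <- rbinom(200, size = 1, prob = 0.7)

# Interior knots at the tertiles of the observed event times
knots <- quantile(time[status == 1], probs = c(1, 2) / 3)

# Boundary knots covering the whole follow-up period
bound <- c(0, max(time))

# Cubic B-spline basis on these knots (illustration only):
# number of columns = number of interior knots + degree + 1
basis <- bs(time, knots = knots, degree = 3,
            Boundary.knots = bound, intercept = TRUE)
dim(basis)
```

The resulting `knots` and `bound` vectors could then be passed as the arguments of the same names.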

Value

An object of class mexhaz, xhaz or mexhazLT. This object is a list containing the following components:

`dataset`	name of the dataset used to fit the model.
`call`	function call on which the model is based.
`formula`	formula part of the call.
`withAlpha`	logical value indicating whether the model corresponds to a class of models correcting for life tables.
`expected`	name of the variable corresponding to the population hazard.
`expectedCum`	name of the variable corresponding to the cumulative population hazard.
`xlevels`	information concerning the levels of the categorical variables used in the model.
`n.obs.tot`	total number of observations in the dataset.
`n.obs`	number of observations used to fit the model (after exclusion of missing values).
`n.events`	number of events (after exclusion of missing values).
`n.clust`	number of clusters.
`n.time.0`	number of observations for which the observed follow-up time was equal to 0 (only for right censored type data).
`base`	function used to model the baseline hazard.
`max.time`	maximal observed time in the dataset.
`boundary.knots`	vector of boundary values used to define the B-spline (or natural spline) bases.
`degree`	degree of the B-spline used to model the logarithm of the baseline hazard.
`knots`	vector of interior knots used to define the B-spline (or natural spline) bases.
`names.ph`	names of the covariables with a proportional effect.
`random`	name of the variable defining cluster membership (set to NA in the case of a purely fixed effects model).
`init`	a vector containing the initial values of the parameters.
`coefficients`	a vector containing the parameter estimates.
`std.errors`	a vector containing the standard errors of the parameter estimates.
`vcov`	the variance-covariance matrix of the estimated parameters.
`gradient`	the gradient of the log-likelihood function evaluated at the estimated parameters.
`hessian`	the Hessian of the log-likelihood function evaluated at the estimated parameters.
`mu.hat`	a data.frame containing the estimated cluster-specific random effects (shrinkage estimators).
`var.mu.hat`	the covariance matrix of the cluster-specific shrinkage estimators.
`vcov.fix.mu.hat`	a matrix containing the covariances between the fixed effect and the cluster-specific shrinkage estimators. More specifically, the i-th row of the matrix represents the covariances between the shrinkage estimator of the i-th cluster and the fixed effect estimates. This matrix is used by the function `predict.mexhaz` to make cluster-specific predictions.
`data`	original dataset used to fit the model (if `keep.data` was set to `TRUE`).
`n.par`	number of estimated parameters.
`n.gleg`	number of Gauss-Legendre quadrature points used to calculate the cumulative (excess) hazard (only relevant if a B-spline of degree 2 or 3 or a cubic restricted spline was used to model the logarithm of the baseline hazard).
`n.aghq`	number of adaptive Gauss-Hermite quadrature points used to calculate the cluster-specific marginal likelihoods (only relevant if a multi-level model is fitted).
`fnoptim`	name of the R optimisation procedure used to maximise the likelihood.
`method`	optimisation method used by optim.
`code`	code (integer) indicating the status of the optimisation process (this code has a different meaning for nlm and for optim: see their respective help page for details).
`loglik`	value of the log-likelihood at the end of the optimisation procedure. Note that this differs from the value calculated in mexhaz, as the cumulative expected hazard cannot be removed from the log-likelihood.
`iter`	number of iterations used in the optimisation process.
`eval`	number of evaluations used in the optimisation process.
`time.elapsed`	total time required to reach convergence.
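
The `coefficients` and `std.errors` components can be combined into Wald-type confidence intervals. The sketch below uses a stand-in list with made-up numbers instead of a fitted model, so the values and the parameter names `agecr` and `armt` are purely illustrative. Exponentiating gives hazard-ratio-type quantities, which is meaningful for covariate effects only, not for baseline hazard parameters.

```r
# Stand-in for a fitted object: a plain list with made-up numbers,
# NOT actual output from mexhazLT
fit <- list(
  coefficients = c(agecr = 0.25, armt = -0.10),
  std.errors   = c(agecr = 0.05, armt = 0.08)
)

z <- qnorm(0.975)  # normal quantile for a two-sided 95% interval

# Wald confidence intervals on the coefficient scale
ci <- cbind(
  estimate = fit$coefficients,
  lower    = fit$coefficients - z * fit$std.errors,
  upper    = fit$coefficients + z * fit$std.errors
)

# Same quantities on the hazard ratio scale (covariate effects only)
hr <- exp(ci)
hr
```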

Note

Follow-up time must be expressed in years.
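
If follow-up is recorded in days, one way to meet this requirement is to divide by an average year length before building the `Surv` response. A minimal sketch, assuming a hypothetical helper `days_to_years` and the 365.25-day convention (the divisor should match the one used by the life table at hand):

```r
# Hypothetical helper: convert follow-up times from days to years
days_to_years <- function(days, days_per_year = 365.25) {
  stopifnot(is.numeric(days), all(days >= 0, na.rm = TRUE))
  days / days_per_year
}

follow_up_days  <- c(0, 365.25, 1826.25)  # made-up follow-up times in days
follow_up_years <- days_to_years(follow_up_days)
follow_up_years
```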

Author(s)

Juste Goungounga, Hadrien Charvat, Nathalie Graffeo, Roch Giorgi

References

Goungounga JA, Touraine C, Grafféo N, Giorgi R; CENSUR working survival group. Correcting for misclassification and selection effects in estimating net survival in clinical trials. BMC Med Res Methodol. 2019 May 16;19(1):104. doi:10.1186/s12874-019-0747-3. PMID: 31096911; PMCID: PMC6524224.

Goungounga JA, Grafféo N, Charvat H, Giorgi R. Correcting for heterogeneity and non-comparability bias in multicenter clinical trials with a rescaled random-effect excess hazard model. Biometrical Journal. 2023;65(4):e2100210. doi:10.1002/bimj.202100210. PMID: 36890623.

Examples


library("xhaz")
library("survival")   # provides Surv() and the survexp.us rate table
library("numDeriv")
library("survexp.fr")
library("splines")
library("statmod")

# Load the 'breast' data set.
data("breast")

# Flexible mexhaz model: baseline excess hazard modelled with cubic B-splines.
# Assumption on the available life table:
# other-cause mortality in the cohort is comparable to the mortality
# observed in the general population with the same characteristics.

# The life table to be used is survexp.us. Note that SEX is coded 2 instead of female in survexp.us.
breast$sexe <- "female"

fit.haz <- exphaz(
                  formula = Surv(temps, statut) ~ 1,
                  data = breast, ratetable = survexp.us,
                  only_ehazard = FALSE,
                  rmap = list(age = 'age', sex = 'sexe', year = 'date'))

breast$expected <- fit.haz$ehazard
breast$expectedCum <- fit.haz$ehazardInt

mod.bs <- mexhazLT(formula = Surv(temps, statut) ~ agecr + armt,
                  data = breast,
                  ratetable = survexp.us, degree = 3,
                  knots=quantile(breast[breast$statut==1,]$temps, probs=c(1:2/3)),
                  expected = "expected",expectedCum = "expectedCum",
                  base = "exp.bs", pophaz = "classic")

mod.bs


# Flexible mexhaz model: baseline excess hazard modelled with cubic B-splines.
# Assumption on the available life table:
# other-cause mortality in the cohort differs from the mortality
# observed in the general population with the same characteristics.

mod.bs2 <- mexhazLT(formula = Surv(temps, statut) ~ agecr + armt,
                  data = breast, degree = 3,
                  knots=quantile(breast[breast$statut==1,]$temps, probs=c(1:2/3)),
                  expected = "expected",expectedCum = "expectedCum",
                  base = "exp.bs", pophaz = "rescaled")

mod.bs2


# Flexible mexhaz model with a random effect at the cluster level:
# baseline excess hazard modelled with cubic B-splines.
# Assumption on the life table used:
# other-cause mortality in the cohort differs from the mortality
# observed in the general population with the same characteristics.

mod.bs3 <- mexhazLT(formula = Surv(temps, statut) ~ agecr + armt,
                  data = breast, degree = 3,
                  knots=quantile(breast[breast$statut==1,]$temps, probs=c(1:2/3)),
                  expected = "expected",expectedCum = "expectedCum",
                  base = "exp.bs", pophaz = "rescaled", random = "hosp")

mod.bs3

[Package xhaz version 2.0.2 Index]