R: xhaz function

xhaz {xhaz}

R Documentation

xhaz function

Description

Fits the excess hazard models proposed by Esteve et al. (1990) doi:10.1002/sim.4780090506, with the possibility of accounting for time-dependent covariates. Also fits the non-proportional excess hazard model proposed by Giorgi et al. (2005) doi:10.1002/sim.2400. In addition, fits excess hazard models with the possibility of rescaling (Goungounga et al. (2019) doi:10.1186/s12874-019-0747-3) or correcting the background mortality with a proportional (Touraine et al. (2020) doi:10.1177/0962280218823234) or non-proportional (Mba et al. (2020) doi:10.1186/s12874-020-01139-z) effect.

Usage

xhaz(
  formula = formula(data),
  data = sys.parent(),
  ratetable,
  rmap = list(age = NULL, sex = NULL, year = NULL),
  baseline = c("constant", "bsplines"),
  pophaz = c("classic", "rescaled", "corrected"),
  only_ehazard = FALSE,
  add.rmap = NULL,
  add.rmap.cut = list(breakpoint = FALSE, cut = NA, probs = NULL, criterion = "BIC",
    print_stepwise = FALSE),
  interval,
  ratedata = sys.parent(),
  subset,
  na.action,
  init,
  control = list(eps = 1e-04, iter.max = 800, level = 0.95),
  optim = TRUE,
  scale = 365.2425,
  trace = 0,
  speedy = FALSE,
  nghq = 12,
  method = "L-BFGS-B",
  ...
)

Arguments

`formula`	a formula object with the response on the left of a `~` operator and the terms on the right. The response must be a survival object as returned by the `Surv` function (time first, status second).
`data`	a data frame in which to interpret the variables named in the formula.
`ratetable`	a rate table stratified by age, sex, year (if missing, `ratedata` is used)
`rmap`	a list that maps data set names to the ratetable names.
`baseline`	specifies the baseline hazard: `baseline = "constant"` gives a piecewise constant baseline, as in the Esteve et al. model; `baseline = "bsplines"` gives a quadratic B-spline baseline, as for the baseline excess hazard in the Giorgi et al. model.
`pophaz`	a character string with three possible values: `"classic"`, `"rescaled"` or `"corrected"`. If `pophaz = "classic"`, fits a model that does not rescale or correct the background mortality (i.e. the Esteve et al. model or the Giorgi et al. model); if `pophaz = "rescaled"` or `pophaz = "corrected"`, fits a model that rescales or corrects the background mortality, respectively.
`only_ehazard`	a boolean argument (by default, `only_ehazard=FALSE`). If `only_ehazard = TRUE`, `pophaz = "classic"` must be provided and the total value of the log-likelihood will not account for the cumulative population hazard.
`add.rmap`	a character string giving the name of the additional demographic variable in `data` used to correct the life table, in particular when the life table is insufficiently stratified (see the Touraine et al. model). This argument is not used if `pophaz = "classic"` or `pophaz = "rescaled"`.
`add.rmap.cut`	a list of arguments specifying the modeling strategy for breakpoint positions, which allows a non-proportional effect of the correction term acting on the background mortality. By default `list(breakpoint = FALSE)`, i.e. the correction term has a proportional effect on the background mortality; in this case, the other elements of the list are ignored. If `list(breakpoint = TRUE, cut = c(70))`, the cut-point(s) are the numeric value(s) supplied. If `list(breakpoint = TRUE, cut = NA)`, there are as many breakpoints as there are `NA` values, and their candidate positions are given by `probs`, e.g. `list(breakpoint = TRUE, cut = NA, probs = seq(0, 1, 0.25))`; `probs` is a numeric vector of probabilities with values between 0 and 1, as in the `quantile` function. `criterion` selects the best model using the AIC or the BIC (the default). To print all the fitted models, add `print_stepwise = TRUE` to the list.
`interval`	a vector giving either the bounds of the time intervals (in years) for models with a piecewise constant baseline, or the location of the knots for models with a B-spline baseline (see the `baseline` argument). The first element of the vector is 0, and the last one is the maximum follow-up time of the study.
`ratedata`	a data frame of the mortality hazards in the general population.
`subset`	an expression indicating which subset of the data should be used in the modeling. All observations are included by default.
`na.action`	as in the `coxph` function, a missing-data filter function.
`init`	a list of initial values for the parameters to estimate. For each element of the list, give the name of the covariate followed by the vector of its initial values.
`control`	a list of values used to control the optimization process. In this list, `eps` is the convergence criterion (by default, `eps = 1e-04`), `iter.max` is the maximum number of iterations (by default, `iter.max = 800`), and `level` is the level used for the confidence intervals (by default, `level = 0.95`).
`optim`	a Boolean argument (by default, `optim = TRUE`). If `optim = TRUE`, the maximization algorithm uses the `optim` function.
`scale`	a numeric argument to specify whether the life table contains death rates per day (default `scale = 365.2425`) or death rates per year (`scale = 1`).
`trace`	a Boolean argument (by default, `trace = 0`); if `trace = TRUE`, tracing information on the progress of the optimization is produced.
`speedy`	a Boolean argument; if `speedy = TRUE`, the optimization is run in parallel.
`nghq`	number of nodes and weights for Gaussian quadrature
`method`	the `method` argument passed to the `optim` function (by default, `"L-BFGS-B"`).
`...`	other parameters used with the `xhaz` function
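
The `probs` entry of `add.rmap.cut` is described above as a vector of probabilities in the sense of `quantile`. As a rough illustration only, the sketch below turns such probabilities into candidate breakpoint positions. It assumes the breakpoints lie on the age scale (as the values 70 and 75 used elsewhere on this page suggest) and uses hypothetical ages; how `xhaz` actually builds and searches its candidates is not stated here.

```r
# Hypothetical ages at diagnosis; not taken from any xhaz data set.
age <- c(42, 55, 61, 64, 68, 70, 73, 77, 81, 88)

# Probabilities in the same form as the `probs` entry of `add.rmap.cut`.
probs <- seq(0, 1, 0.25)

# Assumption: candidate breakpoints are the empirical quantiles of age.
candidate_cuts <- quantile(age, probs = probs, names = FALSE)
candidate_cuts
```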

Details

Use the Surv(time_start, time_stop, status) notation for time-dependent covariates, with the data set organized accordingly (see the help page of the Surv function).
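
As a minimal sketch of what such an organization can look like, the base R code below splits each subject's follow-up at the time a covariate changes, giving one row per (time_start, time_stop] episode. The data, the column names (`switch_time`, `treated`) and the helper `split_one` are hypothetical and only serve the illustration.

```r
# Hypothetical follow-up: one row per subject, times in years.
subjects <- data.frame(id = c(1, 2),
                       time = c(3.0, 1.2),
                       status = c(1, 0),
                       switch_time = c(1.5, NA))  # NA: covariate never changes

# Split one subject into episodes over which the covariate is constant.
split_one <- function(row) {
  if (is.na(row$switch_time) || row$switch_time >= row$time) {
    data.frame(id = row$id, time_start = 0, time_stop = row$time,
               status = row$status, treated = 0)
  } else {
    data.frame(id = row$id,
               time_start = c(0, row$switch_time),
               time_stop = c(row$switch_time, row$time),
               status = c(0, row$status),  # the event belongs to the last episode
               treated = c(0, 1))
  }
}

long <- do.call(rbind, lapply(seq_len(nrow(subjects)),
                              function(i) split_one(subjects[i, ])))
long
```

With data in this shape, the response would be written as Surv(time_start, time_stop, status) and `treated` used as an ordinary covariate.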

Only two interior knots are possible for the model that uses B-spline functions to fit the baseline (excess) hazard. The intervals may be user-defined or computed automatically from the quantiles of the distribution of deaths. Use NA for automatic determination (for example, interval = c(0, NA, NA, 5)).
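
To give an idea of what automatic determination could amount to, the sketch below fills the NA positions of an interval vector with quantiles of the observed death times. The data are hypothetical, and the use of equally spaced quantiles is an assumption made for illustration; the exact rule applied by `xhaz` is not documented on this page.

```r
# Hypothetical follow-up times (years) and death indicators.
time   <- c(0.3, 0.8, 1.1, 1.9, 2.4, 3.0, 3.7, 4.2, 4.9, 5.0)
status <- c(1,   1,   0,   1,   1,   0,   1,   1,   0,   1)

# Interval vector with two positions left for automatic determination.
interval <- c(0, NA, NA, 5)

# Assumption: missing positions are placed at equally spaced quantiles
# of the death times (here the 1/3 and 2/3 quantiles).
n_missing <- sum(is.na(interval))
probs <- seq_len(n_missing) / (n_missing + 1)
interval[is.na(interval)] <- quantile(time[status == 1], probs = probs,
                                      names = FALSE)
interval
```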

Value

An object of class constant or bsplines, according to the type of function chosen to fit the baseline hazard of the model (see the baseline argument). This object is a list containing the following components:

`coefficients`	estimates found for the model
`varcov`	the variance-covariance matrix
`loglik`	for the Esteve et al. model: the log-likelihood of the null model, i.e. without covariates, and the log-likelihood of the full model, i.e. with all the covariates declared in the formula; for the Giorgi et al. model: the log-likelihood of the full model
`cov.test`	for the Esteve et al. model: the log-likelihood of the null model, i.e. without covariates, and the log-likelihood of the full model, i.e. with all the covariates declared in the formula; for the Giorgi et al. model: the log-likelihood of the full model
`message`	a character string returned by the optimizer (see the `optim` help page for details)
`convergence`	an integer code as in `optim` when `"L-BFGS-B"` method is used.
`n`	the number of individuals in the dataset
`n.events`	the number of events in the dataset. Events are deaths, whatever the cause
`level`	the confidence level used
`interval`	the intervals used to split time for a piecewise constant baseline excess hazard, or the knot positions for a B-spline baseline
`terms`	the representation of the terms in the model
`call`	the function call used to fit the model
`pophaz`	the assumption considered for the life table used in the excess hazard model
`add.rmap`	the additional variable for which the life table is not stratified
`ehazardInt`	the cumulative hazard of each individual, calculated from the ratetable used in the model
`ehazard`	the individual expected hazard values from the ratetable used to fit the model
`data`	the dataset used to run the model
`time_elapsed`	the time to run the model
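
As an illustration of how the `coefficients`, `varcov` and `level` components relate to one another, the sketch below computes Wald-type confidence intervals. The numbers are hypothetical stand-ins for a fitted object, and whether `xhaz` computes its printed intervals in exactly this way is an assumption.

```r
# Hypothetical stand-ins for fit$coefficients, fit$varcov and fit$level.
coefficients <- c(agec = 0.25, race = 0.40)
varcov <- matrix(c(0.010, 0.002,
                   0.002, 0.040),
                 nrow = 2,
                 dimnames = list(names(coefficients), names(coefficients)))
level <- 0.95

# Wald interval: estimate +/- normal quantile * standard error.
z <- qnorm(1 - (1 - level) / 2)
se <- sqrt(diag(varcov))
ci <- cbind(estimate = coefficients,
            lower = coefficients - z * se,
            upper = coefficients + z * se)
ci

# Exponentiated scale, assuming log-linear proportional effects.
exp(ci)
```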

Note

Time must be expressed in years.
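
When follow-up is recorded in days or months, it can be converted before fitting. A minimal sketch with hypothetical data and column names, using 365.2425 days per year (the same value as the default `scale`) and 12 months per year:

```r
# Hypothetical follow-up recorded in days and in months.
follow_up <- data.frame(time_days = c(365.2425, 730.485, 100),
                        time_months = c(12, 24, 6))

days_per_year <- 365.2425

# Convert both time scales to years.
follow_up$time_year_from_days <- follow_up$time_days / days_per_year
follow_up$time_year_from_months <- follow_up$time_months / 12
follow_up
```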

Author(s)

Juste Goungounga, Darlin Robert Mba, Nathalie Graffeo, Roch Giorgi

References

Goungounga JA, Touraine C, Grafféo N, Giorgi R; CENSUR working survival group. Correcting for misclassification and selection effects in estimating net survival in clinical trials. BMC Med Res Methodol. 2019 May 16;19(1):104. doi: 10.1186/s12874-019-0747-3. PMID: 31096911; PMCID: PMC6524224.

Touraine C, Grafféo N, Giorgi R; CENSUR working survival group. More accurate cancer-related excess mortality through correcting background mortality for extra variables. Stat Methods Med Res. 2020 Jan;29(1):122-136. doi: 10.1177/0962280218823234. Epub 2019 Jan 23. PMID: 30674229.

Mba RD, Goungounga JA, Grafféo N, Giorgi R; CENSUR working survival group. Correcting inaccurate background mortality in excess hazard models through breakpoints. BMC Med Res Methodol. 2020 Oct 29;20(1):268. doi: 10.1186/s12874-020-01139-z. PMID: 33121436; PMCID: PMC7596976.

Giorgi R, Abrahamowicz M, Quantin C, Bolard P, Esteve J, Gouvernet J, Faivre J. A relative survival regression model using B-spline functions to model non-proportional hazards. Statistics in Medicine 2003; 22: 2767-84.

Examples


library("xhaz")
library("survival")   # provides Surv() and the survexp.us rate table
library("numDeriv")
library("survexp.fr")
library("splines")
library("statmod")
# load the data sets 'simuData', 'rescaledData' and 'dataCancer'.
data("simuData", "rescaledData", "dataCancer")

# Esteve et al. model: baseline excess hazard is a piecewise constant
#                      function, with linear and proportional effects of
#                      the covariates on the baseline excess hazard.


fit.estv1 <- xhaz(formula = Surv(time_year, status) ~ agec + race,
                  data = simuData,
                  ratetable = survexp.us,
                  interval = c(0, NA, NA, NA, NA, NA, 6),
                  rmap = list(age = 'age', sex = 'sex', year = 'date'),
                  baseline = "constant", pophaz = "classic")


fit.estv1


# Touraine et al. model: baseline excess hazard is a piecewise constant
#                        function, with linear and proportional effects of
#                        the covariates on the baseline excess hazard.
# An additional covariate (here race) missing from the life table is
# taken into account by the model.


fit.corrected1 <- xhaz(formula = Surv(time_year, status) ~ agec + race,
                       data = simuData,
                       ratetable = survexp.us,
                       interval = c(0, NA, NA, NA, NA, NA, 6),
                       rmap = list(age = 'age', sex = 'sex', year = 'date'),
                       baseline = "constant", pophaz = "corrected",
                       add.rmap = "race")



fit.corrected1

# Extension of the Touraine et al. model: baseline excess hazard is a
# piecewise constant function, with linear and proportional effects of the
# covariates on the baseline excess hazard.

# An additional covariate (here race) missing from the life table is
# taken into account by the model, with a breakpoint at 75 years.

fit.corrected2 <- xhaz(formula = Surv(time_year, status) ~ agec + race,
                       data = simuData,
                       ratetable = survexp.us,
                       interval = c(0, NA, NA, NA, NA, NA, 6),
                       rmap = list(age = 'age', sex = 'sex', year = 'date'),
                       baseline = "constant", pophaz = "corrected",
                       add.rmap = "race",
                       add.rmap.cut = list(breakpoint = TRUE, cut = 75))



fit.corrected2


# Giorgi et al. model: baseline excess hazard is a quadratic B-spline
#                      function with two interior knots; here the
#                      covariates have linear and proportional effects on
#                      the baseline excess hazard.


fitphBS <- xhaz(formula = Surv(time_year, status) ~ agec + race,
                data = simuData,
                ratetable = survexp.us,
                interval = c(0, NA, NA, 6),
                rmap = list(age = 'age', sex = 'sex', year = 'date'),
                baseline = "bsplines", pophaz = "classic")

fitphBS





# Application to `dataCancer`.
# Giorgi et al. model: baseline excess hazard is a quadratic B-spline
#                      function with two interior knots; here the variable
#                      "immuno_trt" has a linear and proportional effect
#                      and the variable "ageCentre" a non-proportional
#                      effect on the baseline excess hazard.


fittdphBS <- xhaz(formula = Surv(obs_time_year, event) ~ qbs(ageCentre) + immuno_trt,
                  data = dataCancer,
                  ratetable = survexp.fr,
                  interval = c(0, 0.5, 12, 15),
                  rmap = list(age = 'age', sex = 'sexx', year = 'year_date'),
                  baseline = "bsplines", pophaz = "classic")

fittdphBS




# Application to `rescaledData`.
# Rescaled model: baseline excess hazard is a piecewise constant function,
# with linear and proportional effects of the covariates on the baseline
# excess hazard.

# A scale parameter on the expected mortality of the general population is
# included to account for non-comparability as a source of bias.

rescaledData$timeyear <- rescaledData$time/12
rescaledData$agecr <- scale(rescaledData$age, TRUE, TRUE)

fit.res <- xhaz(formula = Surv(timeyear, status) ~ agecr + hormTh,
                data = rescaledData,
                ratetable = survexp.fr,
                interval = c(0, NA, NA, NA, NA, NA, max(rescaledData$timeyear)),
                rmap = list(age = 'age', sex = 'sex', year = 'date'),
                baseline = "constant", pophaz = "rescaled")

fit.res

[Package xhaz version 2.0.2 Index]