R: Fit Relative Survival Model

flexrsurv {flexrsurv}

R Documentation

Fit Relative Survival Model

Description

flexrsurv is used to fit relative survival regression models. Time-dependent variables, non-proportional (time-dependent) effects and non-linear effects are implemented using splines (B-spline and truncated power bases). Effects that are simultaneously non-linear and non-proportional are implemented using the approaches developed by Remontet et al. (2007) and Mahboubi et al. (2011).

Usage

flexrsurv(formula=formula(data),
   data=parent.frame(), 
   knots.Bh,
   degree.Bh=3,
   Spline=c("b-spline", "tp-spline", "tpi-spline"), 
   log.Bh=FALSE,
   bhlink=c("log", "identity"),
   Min_T=0,
   Max_T=NULL,
   model=c("additive","multiplicative"),
   rate=NULL, 
   weights=NULL,
   na.action=NULL,
   int_meth=c("GL", "CAV_SIM", "SIM_3_8", "BOOLE", "BANDS"),
   npoints=20,   
   stept=NULL,              
   bands=NULL,
   init=NULL,
   initbyglm=TRUE,
   initbands=bands,
   optim.control=list(trace=100, REPORT=1, fnscale=-1, maxit=25), 
   optim_meth=c("BFGS", "CG", "Nelder-Mead", "L-BFGS-B", "SANN", "Brent"),
   control.glm=list(epsilon=1e-8, maxit=100, trace=FALSE, epsilon.glm=1e-1, maxit.glm=25),
   vartype =  c("oim", "opg", "none"),
   debug=FALSE
   )


flexrsurv.ll(formula=formula(data), 
   data=parent.frame(), 
   knots.Bh=NULL,   
   degree.Bh=3,
   Spline=c("b-spline", "tp-spline", "tpi-spline"), 
   log.Bh=FALSE,
   bhlink=c("log", "identity"),
   Min_T=0,
   Max_T=NULL,
   model=c("additive","multiplicative"),
   rate=NULL, 
   weights=NULL,
   na.action=NULL, 
   int_meth=c("GL", "CAV_SIM", "SIM_3_8", "BOOLE", "GLM", "BANDS"),
   npoints=20,   
   stept=NULL,
   bands=NULL,
   init=NULL,
   optim.control=list(trace=100, REPORT=1, fnscale=-1, maxit=25), 
   optim_meth=c("BFGS", "CG", "Nelder-Mead", "L-BFGS-B", "SANN", "Brent"),
   vartype =  c("oim", "opg", "none"),
   debug=FALSE
   )

Arguments

`formula`	a formula object, with the response on the left of a ~ operator, and the terms on the right. The response must be a survival object as returned by the `Surv` function.
`data`	a data.frame in which to interpret the variables named in the formula.
`knots.Bh`	the internal breakpoints that define the spline used to estimate the baseline hazard. Typical values are the mean or median for one knot, quantiles for more knots.
`degree.Bh`	degree of the piecewise polynomial of the baseline hazard. Default is 3 for cubic splines.
`Spline`	a character string specifying the type of spline basis. "b-spline" for B-spline basis, "tp-spline" for truncated power basis and "tpi-spline" for monotone (increasing) truncated power basis.
`log.Bh`	logical value: if TRUE, an additional basis equal to log(time) is added to the spline bases of time.
`bhlink`	character string specifying the link function of the baseline hazard: if `"log"`, the log of the baseline hazard is modelled; if `"identity"`, the baseline hazard itself is modelled.
`Min_T`	minimum of the time period that is analysed. Default is `max(0.0, min(bands) )`.
`Max_T`	maximum of the time period that is analysed. Default is `max(c(bands, timevar))`.
`model`	character string specifying the type of model for effects that are both non-proportional and non-linear. The model `model="additive"` assumes effects as explained in Remontet et al. (2007); the model `model="multiplicative"` assumes effects as explained in Mahboubi et al. (2011).
`rate`	an optional vector of the background rate for a relevant comparative population to be used in the fitting process. Should be a numeric vector (for relative survival model). `rate` is evaluated in the same way as variables in `formula`, that is first in `data` and then in the environment of `formula`.
`weights`	an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If not null, the total likelihood is the weighted sum of individual likelihood.
`na.action`	a missing-data filter function, applied to the model.frame, after any subset argument has been used. Default is options()$na.action.
`int_meth`	character string specifying the numerical integration method. Possible values are "GL" for Gauss-Legendre quadrature, "CAV_SIM" for the Cavalieri-Simpson rule, "SIM_3_8" for Simpson's 3/8 rule, "BOOLE" for Boole's rule, or "BANDS" for the midpoint rule with specified bands.
`npoints`	number of points used in the Gauss-Legendre quadrature (when `int_meth="GL"`).
`stept`	scalar value of the time-step in numerical integration. It is required only when `int_meth="CAV_SIM"` or `"SIM_3_8"` or `"BOOLE"`. If no value is supplied, `Max_T/500` is used.
`bands`	bands used to split data in the numerical integration when `int_meth="BANDS"`.
`init`	starting values of the parameters.
`initbyglm`	a logical value indicating how the starting values are found or refined. If TRUE, the fitting method described in Remontet et al. (2007) is used to find or refine the starting values, which may speed up the fit. If FALSE, the maximisation of the likelihood starts at the values given in `init`. If `init=NULL`, the starting values correspond to a constant net hazard equal to the ratio of the number of events to the total person-time.
`initbands`	bands used to split data when `initbyglm=TRUE`.
`optim.control`	a list of control parameters passed to the `optim()` function.
`optim_meth`	method to be used to optimize the likelihood. See `optim`.
`control.glm`	a list of control parameters passed to the `glm()` function when `method="glm"`.
`vartype`	character string specifying the type of variance matrix computed by `flexrsurv`: the inverse of the Hessian matrix computed at the MLE (i.e. the inverse of the observed information matrix) if `vartype="oim"`, or the inverse of the outer product of the gradients if `vartype="opg"`. The variance is not computed when `vartype="none"`.
`debug`	controls the volume of intermediate output.
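
To make the `int_meth`, `bands` and `stept` arguments more concrete, the following base R sketch integrates a hazard over [0, Max_T] with two of the rules named above. It only illustrates the quadrature ideas; the hazard function `haz` and its closed-form integral `exact` are hypothetical, and the package's internal implementation may differ.

```r
# Hypothetical excess hazard (per day), chosen so the exact integral is known.
haz   <- function(t) 1e-4 + 1e-11 * t^2
exact <- function(t) 1e-4 * t + 1e-11 * t^3 / 3

Max_T <- 5400

# Midpoint rule on user-supplied bands (the idea behind int_meth = "BANDS").
bands   <- seq(0, Max_T, by = 50)
mid     <- (head(bands, -1) + tail(bands, -1)) / 2
H_bands <- sum(haz(mid) * diff(bands))

# Composite Cavalieri-Simpson rule with a fixed step
# (the idea behind int_meth = "CAV_SIM").
stept <- Max_T / 500                          # documented default step
grid  <- seq(0, Max_T, length.out = 501)      # 500 intervals of width stept
n     <- length(grid)
w     <- c(1, rep(c(4, 2), length.out = n - 2), 1)   # weights 1,4,2,...,2,4,1
H_simpson <- sum(w * haz(grid)) * stept / 3

# Compare both approximations with the closed-form cumulative hazard.
print(c(exact = exact(Max_T), bands = H_bands, simpson = H_simpson))
```

Narrower bands or a smaller `stept` reduce the integration error at the cost of more hazard evaluations per subject.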

Details

Full descriptions of the additive and the multiplicative models for effects that are both non-linear and non-proportional are given in Remontet et al. (2007) and Mahboubi et al. (2011), respectively.

flexrsurv.ll is the workhorse function: it is not normally called directly.
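
As a reminder of the general setting, and not a transcription of the package internals, a relative survival model splits the observed hazard into the expected population hazard (supplied through `rate`) and an excess hazard. The baseline part of the excess hazard is a spline in time whose form depends on `bhlink`, as stated for the `bhlink` component under Value. The symbols below (\lambda_{\mathrm{obs}}, \lambda_{\mathrm{pop}}, \lambda_{\mathrm{exc}}, \lambda_0, \gamma_{0i}, b_i) are notation assumed here for illustration.

```latex
\lambda_{\mathrm{obs}}(t \mid x) = \lambda_{\mathrm{pop}}(t) + \lambda_{\mathrm{exc}}(t \mid x)

\texttt{bhlink = "log"}: \quad \log \lambda_0(t) = \sum_i \gamma_{0i}\, b_i(t)

\texttt{bhlink = "identity"}: \quad \lambda_0(t) = \sum_i \gamma_{0i}\, b_i(t)
```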

Value

flexrsurv returns an object of class "flexrsurv". An object of class "flexrsurv" is a list containing at least the following components:

`coefficients`	a named vector of coefficients
`loglik`	the log-likelihood
`var`	estimated covariance matrix for the estimated coefficients
`informationMatrix`	estimated information matrix
`bhlink`	the link of the baseline hazard: if `"identity"`, baseline = sum g0_i b_i(t); if `"log"`, log(baseline) = sum g0_i b_i(t)
`init`	vector of the starting values supplied
`converged`	logical: was the optimizer judged to have converged?
`linear.predictors`	the linear fit on link scale (not including the baseline hazard term if `bhlink = "identity"`)
`fitted.values`	the estimated value of the hazard rate at each event time, obtained by transforming the linear predictors by the inverse of the link function
`cumulative.hazard`	the estimated value of the cumulative hazard in the time interval
`call`	the matched call
`formula`	the formula supplied
`terms`	the `terms` object used
`data`	the `data` argument
`rate`	the rate vector used
`time`	the time vector used
`workingformula`	the formula used by the fitter
`optim.control`	the value of the `optim.control` argument supplied
`control.glm`	the value of the `control.glm` argument supplied
`method`	the name of the fitter function used
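
The `coefficients` and `var` components can be combined into Wald-type confidence intervals. The sketch below uses a hypothetical stand-in list with made-up numbers instead of a real fit, so that it runs without the package; with an actual "flexrsurv" object the same two components would be read from the fitted object, assuming `var` was computed (that is, `vartype` was not `"none"`).

```r
# Hypothetical stand-in for a fitted object: only the two documented
# components used below are mimicked, and the numbers are made up.
nm  <- c("sex01", "age")
fit <- list(
  coefficients = c(sex01 = 0.25, age = 0.03),
  var = matrix(c(0.0100, 0.0010,
                 0.0010, 0.0004),
               nrow = 2, dimnames = list(nm, nm))
)

# Standard errors are the square roots of the diagonal of the covariance matrix.
se <- sqrt(diag(fit$var))

# 95% Wald confidence intervals on the scale of the linear predictor.
z  <- qnorm(0.975)
ci <- cbind(estimate = fit$coefficients,
            lower    = fit$coefficients - z * se,
            upper    = fit$coefficients + z * se)
print(ci)
```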

References

Mahboubi, A., M. Abrahamowicz, et al. (2011). "Flexible modeling of the effects of continuous prognostic factors in relative survival." Stat Med 30(12): 1351-1365. doi:10.1002/sim.4208

Remontet, L., N. Bossard, et al. (2007). "An overall strategy based on regression models to estimate relative survival and model the effects of prognostic factors in cancer survival studies." Stat Med 26(10): 2214-2228. doi:10.1002/sim.2656

Examples




if (requireNamespace("relsurv", quietly = TRUE)) {

	# data from package relsurv
	data(rdata, package="relsurv")
	
	# rate table from package relsurv
	data(slopop, package="relsurv")
	
	
	# get the death rate at event (or end of followup) from slopop for rdata
	rdata$iage <- findInterval(rdata$age*365.24+rdata$time, attr(slopop, "cutpoints")[[1]])
	rdata$iyear <- findInterval(rdata$year+rdata$time, attr(slopop, "cutpoints")[[2]])
	therate <- rep(-1, dim(rdata)[1])
	for (i in seq_len(nrow(rdata))) {
	  therate[i] <- slopop[rdata$iage[i], rdata$iyear[i], rdata$sex[i]]
	}
	
	rdata$slorate <- therate
	
	# change sex coding
	rdata$sex01 <- rdata$sex -1
	
	# fit a relative survival model with a non linear effect of age
	fit <- flexrsurv(Surv(time,cens)~sex01+NLL(age, Knots=60, Degree=3,
	                                           Boundary.knots = c(24, 95)), 
	                 rate=slorate, data=rdata,
	                 knots.Bh=1850,  # one interior knot at 5 years
	                 degree.Bh=3,
	                 Max_T=5400,
	                 Spline = "b-spline",
	                 initbyglm=TRUE,
	                 initbands=seq(0, 5400, 100), 
	                 int_meth= "BANDS",
	                 bands=seq(0, 5400, 50)
	                 )
	summary(fit)
	
	# fit a relative survival model with a non linear & non proportional effect of age
	fit2 <- flexrsurv(Surv(time,cens)~sex01+NPHNLL(age, time, Knots=60,
	                                               Degree=3,
	                                               Knots.t = 1850, Degree.t = 3), 
	                 rate=slorate, data=rdata,
	                 knots.Bh=1850,  # one interior knot at 5 years
	                 degree.Bh=3,
	                 Spline = "b-spline",
	                 initbyglm=TRUE, 
	                 int_meth= "BOOLE",
	                 stept=50
	                 )
	summary(fit2, correlation=TRUE)
	
}
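
The row-by-row lookup of the population rate in the example above can also be written with matrix indexing. The sketch below demonstrates the idea on a small toy array with arbitrary values, not on `slopop` itself; whether the same indexing works directly on a rate table object depends on its subsetting method and is an assumption that would need checking.

```r
# Toy 3-way rate array standing in for a rate table (age x year x sex);
# the values are arbitrary and for illustration only.
toy <- array(seq_len(24) / 1000, dim = c(4, 3, 2))

# Hypothetical interval indices for three subjects.
iage  <- c(1, 4, 2)
iyear <- c(3, 1, 2)
sex   <- c(1, 2, 2)

# Loop version, mirroring the structure used in the example.
loop_rate <- numeric(length(iage))
for (i in seq_along(iage)) {
  loop_rate[i] <- toy[iage[i], iyear[i], sex[i]]
}

# Vectorised version: each row of the index matrix selects one cell.
vec_rate <- toy[cbind(iage, iyear, sex)]

print(all.equal(loop_rate, vec_rate))
```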

[Package flexrsurv version 2.0.18 Index]