R: Quantile g-computation for survival outcomes

qgcomp.cox.boot {qgcomp}

R Documentation

Quantile g-computation for survival outcomes

Description

This function yields population average effect estimates for (possibly right-censored) time-to-event outcomes.

Usage

qgcomp.cox.boot(
  f,
  data,
  expnms = NULL,
  q = 4,
  breaks = NULL,
  id = NULL,
  weights,
  cluster = NULL,
  alpha = 0.05,
  B = 200,
  MCsize = 10000,
  degree = 1,
  seed = NULL,
  parallel = FALSE,
  parplan = FALSE,
  ...
)

Arguments

`f`	R style survival formula, which includes `Surv` in the outcome definition. E.g. `Surv(time,event) ~ exposure`. Offset terms can be included via `Surv(time,event) ~ exposure + offset(z)`
`data`	data frame
`expnms`	character vector of exposures of interest
`q`	NULL or number of quantiles used to create quantile indicator variables representing the exposure variables. If NULL, then gcomp proceeds with un-transformed version of exposures in the input datasets (useful if data are already transformed, or for performing standard g-computation)
`breaks`	(optional) NULL, or a list of (equal length) numeric vectors that characterize the minimum value of each category for which to break up the variables named in expnms. This is an alternative to using 'q' to define cutpoints.
`id`	(optional) NULL, or variable name indexing individual units of observation (only needed if analyzing data with multiple observations per id/cluster). Note that qgcomp.glm.noboot will not produce cluster-appropriate standard errors. qgcomp.glm.boot can be used instead; it uses bootstrap sampling of clusters/individuals to estimate cluster-appropriate standard errors.
`weights`	"case weights" - passed to the "weights" argument of `coxph`
`cluster`	not yet implemented
`alpha`	alpha level for confidence limit calculation
`B`	integer: number of bootstrap iterations (this should typically be >=200, though it is set lower in examples to improve run-time).
`MCsize`	integer: sample size for simulation to approximate marginal hazard ratios (if < sample size, then set to sample size). Note that large values will slow down the fitting, but will result in higher accuracy - if you run the function multiple times you will see that results vary due to simulation error. Ideally, MCsize would be set such that simulation error is negligible in the precision reported (e.g. if you report results to 2 decimal places, then MCsize should be set high enough that you consistently get answers that are the same to 2 decimal places).
`degree`	polynomial bases for marginal model (e.g. degree = 2 allows that the relationship between the whole exposure mixture and the outcome is quadratic).
`seed`	integer or NULL: random number seed for replicable bootstrap results
`parallel`	logical (default FALSE): use future package to speed up bootstrapping
`parplan`	(logical, default=FALSE) automatically set future::plan to plan(multisession) (and set to existing plan, if any, after bootstrapping)
`...`	arguments to glm (e.g. family)
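
As a rough illustration of what `q` and `breaks` control, the sketch below codes a continuous exposure into categories scored 0 to q - 1 using base R only. It is not the package's internal code: the variable names (`x`, `cuts`, `mins`) and the use of `cut` and `findInterval` are assumptions made for illustration, and the fixed-cutpoint part reflects only one reading of the "minimum value of each category" description.

```r
# Hypothetical exposure; names here are illustrative, not part of qgcomp
set.seed(1)
x <- runif(200)

# q = 4: cut at the sample quartiles and score the categories 0, 1, 2, 3
q <- 4
cuts <- quantile(x, probs = seq(0, 1, length.out = q + 1))
xq <- as.integer(cut(x, breaks = cuts, include.lowest = TRUE)) - 1L
table(xq)

# Fixed cutpoints instead of sample quantiles (assumed reading of `breaks`:
# each value is the minimum of a category)
mins <- c(0, 0.25, 0.5, 0.75)
xb <- findInterval(x, mins) - 1L
table(xb)
```

With sample quantiles, each category holds the same number of observations; with fixed cutpoints, the category sizes depend on the data.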

Details

`qgcomp.cox.boot` estimates the log(hazard ratio) per quantile increase in the joint exposure to all exposures in `expnms`. This function uses g-computation to estimate the parameters of a marginal structural model for the population average effect of increasing all exposures in `expnms` by a single quantile. This approach involves specifying an underlying conditional outcome model, given all exposures of interest (possibly with non-linear basis function representations such as splines or product terms) and confounders or covariates of interest. This model is fit first and is then used to generate expected outcomes at each quantile of all exposures; those outcomes are in turn used in a second model to estimate a population average dose-response curve that is linear or follows a simple polynomial function. See the notes on `MCsize` below.
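
The two-model structure described above can be sketched in a few lines. To keep it short, this sketch uses a continuous outcome and `lm` rather than a censored survival time and a Cox model, so it is a simplified analogue and not what `qgcomp.cox.boot` does internally. All names (`d`, `fit`, `pred`, `msm`, `psi`) and the simulated data are assumptions for illustration.

```r
# Simplified analogue with a continuous outcome; all names are illustrative
set.seed(3)
n <- 500
d <- data.frame(
  x1 = sample(0:3, n, replace = TRUE),  # exposures already coded as quantiles
  x2 = sample(0:3, n, replace = TRUE),
  z  = runif(n)                         # covariate to marginalize over
)
d$y <- 0.2 * d$x1 + 0.1 * d$x2 + 0.5 * d$z + rnorm(n)

# Step 1: conditional outcome model given exposures and covariates
fit <- lm(y ~ x1 + x2 + z, data = d)

# Step 2: predict outcomes with every exposure set to the same quantile level
pred <- do.call(rbind, lapply(0:3, function(k) {
  nd <- transform(d, x1 = k, x2 = k)
  data.frame(level = k, yhat = predict(fit, newdata = nd))
}))

# Step 3: marginal structural model for the joint exposure
msm <- lm(yhat ~ level, data = pred)
psi <- unname(coef(msm)["level"])
psi
```

With a linear conditional model and no interaction terms, `psi` equals the sum of the `x1` and `x2` coefficients from `fit`. The survival version differs in that the second model is fit to simulated outcomes rather than to predicted means.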

Test statistics and confidence intervals are based on a non-parametric bootstrap, using the standard deviation of the bootstrap estimates to estimate the standard error. The bootstrap standard error is then used to estimate Wald-type confidence intervals. Note that no bootstrapping is done on estimated quantiles of exposure, so these are treated as fixed quantities.
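
A minimal sketch of this bootstrap recipe follows, assuming the `survival` package is available. As a stand-in for the full marginal structural model it bootstraps a single conditional Cox coefficient, so the numbers are not comparable to `qgcomp.cox.boot` output. The data and the names `est`, `boot`, `se` and `ci` are hypothetical.

```r
library(survival)

# Hypothetical data, patterned on the Examples section
set.seed(2)
N <- 200
tmg <- pmin(0.1, rweibull(N, 10, 0.1))
dat <- data.frame(time = tmg, d = 1.0 * (tmg < 0.1), x1 = runif(N))

# Quantize once, outside the bootstrap loop, so the cutpoints stay fixed
cuts <- quantile(dat$x1, probs = seq(0, 1, length.out = 5))
dat$x1q <- as.integer(cut(dat$x1, breaks = cuts, include.lowest = TRUE)) - 1L

# Statistic of interest: log hazard ratio per one-quantile increase in x1
est <- function(dd) unname(coef(coxph(Surv(time, d) ~ x1q, data = dd)))

psi <- est(dat)
B <- 200
boot <- replicate(B, est(dat[sample.int(N, replace = TRUE), ]))

# Bootstrap standard error and Wald-type confidence interval
alpha <- 0.05
se <- sd(boot)
ci <- psi + c(-1, 1) * qnorm(1 - alpha / 2) * se
c(psi = psi, se = se, lower = ci[1], upper = ci[2])
```

Because the quantile cutpoints are computed before resampling, their sampling variability does not enter `se`, which mirrors the "fixed quantities" note above.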

`MCsize` is crucial for accurate point estimates. To get marginal estimates of the population hazard under different values of the joint exposure at a given quantile for all exposures in `expnms`, `qgcomp.cox.boot` uses Monte Carlo simulation to generate outcomes implied by the underlying conditional model and then fits a separate (marginal structural) model to those outcomes. For results that do not vary much from run to run, `MCsize` must be set large enough that results are stable across runs to a pre-determined precision (e.g. 2 significant digits).
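
One way to check for this kind of stability is to repeat the estimate under several seeds and compare the rounded results. The sketch below uses a toy Monte Carlo estimator in base R rather than a fitted qgcomp model. The helper `stable_to_digits` and the function `toy_estimate` are hypothetical names, not part of the package.

```r
# TRUE if all estimates agree after rounding to `digits` significant digits
stable_to_digits <- function(estimates, digits = 2) {
  length(unique(signif(estimates, digits))) == 1
}

# Toy Monte Carlo estimator standing in for a fitted model: the mean of
# `mcsize` simulated exponential draws (true value 1)
toy_estimate <- function(seed, mcsize) {
  set.seed(seed)
  mean(rexp(mcsize))
}

seeds <- 1:5
small <- vapply(seeds, toy_estimate, numeric(1), mcsize = 100)
large <- vapply(seeds, toy_estimate, numeric(1), mcsize = 1e6)

# Compare run-to-run agreement at the two simulation sizes
stable_to_digits(small, digits = 2)
stable_to_digits(large, digits = 2)
```

The same check could be applied to `qgcomp.cox.boot` by refitting with different `seed` values at a given `MCsize` and passing the resulting `psi` estimates to the helper. If they disagree at the reporting precision, increase `MCsize`.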

Value

a qgcompfit object, which contains information about the effect measure of interest (psi) and associated variance (var.psi), as well as information on the model fit (fit) and information on the marginal structural model (msmfit) used to estimate the final effect estimates.

Examples

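library(qgcomp)
set.seed(50)
N=200
# simulated survival data, administratively censored at time 0.1
dat <- data.frame(time=(tmg <- pmin(.1,rweibull(N, 10, 0.1))), 
                d=1.0*(tmg<0.1), x1=runif(N), x2=runif(N), z=runif(N))
expnms=paste0("x", 1:2)
f = survival::Surv(time, d)~x1 + x2
# conditional Cox model, for comparison
(fit1 <- survival::coxph(f, data = dat))
# quantile g-computation without bootstrapping
(obj <- qgcomp.cox.noboot(f, expnms = expnms, data = dat))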
## Not run: 
# not run (slow when using boot version to proper precision)
(obj2 <- qgcomp.cox.boot(f, expnms = expnms, data = dat, B=10, MCsize=20000))

# using future package, marginalizing over confounder z
(obj3 <- qgcomp.cox.boot(survival::Surv(time, d)~x1 + x2 + z, expnms = expnms, data = dat, 
                         B=1000, MCsize=20000, parallel=TRUE, parplan=TRUE))
# non-constant hazard ratio, non-linear terms
(obj4 <- qgcomp.cox.boot(survival::Surv(time, d)~factor(x1) + splines::bs(x2) + z, 
                         expnms = expnms, data = dat, 
                         B=1000, MCsize=20000, parallel=FALSE, degree=1))
                         
# weighted analysis
dat$w = runif(N)
(objw1 <- qgcomp.cox.noboot(f, expnms = expnms, data = dat, weights=w))
(objw2 <- qgcomp.cox.boot(f, expnms = expnms, data = dat, weights=w, B=5, MCsize=20000))

## End(Not run)

[Package qgcomp version 2.15.2 Index]