R: EMM for Quantile g-computation for continuous, binary, and...

qgcomp.emm.boot {qgcompint}

R Documentation

EMM for Quantile g-computation for continuous, binary, and count outcomes under linearity/additivity

Description

This function fits a quantile g-computation model, allowing effect measure modification by a binary or continuous covariate. This allows testing of statistical interaction as well as estimation of stratum specific effects. This particular implementation formally fits a marginal structural model using a Monte Carlo-based g-computation method, utilizing bootstrapping for variance estimates. Because this approach allows for non-linear/non-additive effects of exposures, it does not report weights nor EMM stratum specific effects.

Usage

qgcomp.emm.boot(
  f,
  data,
  expnms = NULL,
  emmvar = "",
  q = 4,
  breaks = NULL,
  id = NULL,
  weights,
  alpha = 0.05,
  B = 200,
  rr = TRUE,
  degree = 1,
  seed = NULL,
  bayes = FALSE,
  MCsize = nrow(data),
  parallel = FALSE,
  parplan = FALSE,
  errcheck = FALSE,
  ...
)

Arguments

`f`	R style formula
`data`	data frame
`expnms`	character vector of exposures of interest
`emmvar`	(character) name of effect measure modifier in dataset (if categorical, must be coded as a factor variable)
`q`	NULL or number of quantiles used to create quantile indicator variables representing the exposure variables. If NULL, then gcomp proceeds with un-transformed version of exposures in the input datasets (useful if data are already transformed, or for performing standard g-computation)
`breaks`	(optional) NULL, or a list of (equal length) numeric vectors that characterize the minimum value of each category for which to break up the variables named in expnms. This is an alternative to using 'q' to define cutpoints.
`id`	(optional) NULL, or variable name indexing individual units of observation (only needed if analyzing data with multiple observations per id/cluster). Note that qgcomp.emm.noboot will not produce cluster-appropriate standard errors (this parameter is essentially ignored in qgcomp.emm.noboot). Qgcomp.emm.boot can be used for this, which will use bootstrap sampling of clusters/individuals to estimate cluster-appropriate standard errors via bootstrapping.
`weights`	"case weights" - passed to the "weight" argument of `glm` or `bayesglm`
`alpha`	alpha level for confidence limit calculation
`B`	integer: number of bootstrap iterations (this should typically be >=200, though it is set lower in examples to improve run-time).
`rr`	logical: if using binary outcome and rr=TRUE, qgcomp.boot will estimate risk ratio rather than odds ratio
`degree`	polynomial bases for marginal model (e.g. degree = 2 allows that the relationship between the whole exposure mixture and the outcome is quadratic (default = 1).
`seed`	integer or NULL: random number seed for replicable bootstrap results
`bayes`	use underlying Bayesian model (`arm` package defaults). Results in penalized parameter estimation that can help with very highly correlated exposures. Note: this does not lead to fully Bayesian inference in general, so results should be interpreted as frequentist.
`MCsize`	integer: sample size for simulation to approximate marginal zero inflated model parameters. This can be left small for testing, but should be as large as needed to reduce simulation error to an acceptable magnitude (can compare psi coefficients for linear fits with qgcomp.noboot to gain some intuition for the level of expected simulation error at a given value of MCsize). This likely won't matter much in linear models, but may be important with binary or count outcomes.
`parallel`	use (safe) parallel processing from the future and future.apply packages
`parplan`	(logical, default=FALSE) automatically set future::plan to plan(multisession) (and set to existing plan after bootstrapping)
`errcheck`	(logical, default=TRUE) include some basic error checking. Slightly faster if set to false (but be sure you understand risks)
`...`	arguments to glm (e.g. family)

Value

a qgcompfit object, which contains information about the effect measure of interest (psi) and associated variance (var.psi), as well as information on the model fit (fit) and information on the weights/standardized coefficients in the positive (pos.weights) and negative (neg.weights) directions.

Examples

set.seed(50)
# linear model, binary modifier
dat <- data.frame(y=runif(50), x1=runif(50), x2=runif(50),
  z=rbinom(50,1,0.5), r=rbinom(50,1,0.5))
(qfit <- qgcomp.emm.noboot(f=y ~ z + x1 + x2, emmvar="z",
  expnms = c('x1', 'x2'), data=dat, q=4, family=gaussian()))
# set B larger for real examples
(qfit2 <- qgcomp.emm.boot(f=y ~ z + x1 + x2, emmvar="z",
  degree = 1,
  expnms = c('x1', 'x2'), data=dat, q=4, family=gaussian(), B=10))
# categorical modifier
dat2 <- data.frame(y=runif(50), x1=runif(50), x2=runif(50),
  z=sample(0:2, 50,replace=TRUE), r=rbinom(50,1,0.5))
dat2$z = as.factor(dat2$z)
(qfit3 <- qgcomp.emm.noboot(f=y ~ z + x1 + x2, emmvar="z",
  expnms = c('x1', 'x2'), data=dat2, q=4, family=gaussian()))
# set B larger for real examples
(qfit4 <- qgcomp.emm.boot(f=y ~ z + x1 + x2, emmvar="z",
  degree = 1,
  expnms = c('x1', 'x2'), data=dat2, q=4, family=gaussian(), B=10))