R: Gamma Discriminant Function Approach for Estimating Odds...

p_gdfa {pooling}

R Documentation

Gamma Discriminant Function Approach for Estimating Odds Ratio with Exposure Measured in Pools and Potentially Subject to Multiplicative Lognormal Errors

Description

Assumes exposure given covariates and outcome is a constant-scale Gamma regression. Pooled exposure measurements can be assumed precise or subject to multiplicative lognormal processing error and/or measurement error. Parameters are estimated using maximum likelihood.

Usage

p_gdfa(g, y, xtilde, c = NULL, constant_or = TRUE,
  errors = "processing", estimate_var = TRUE,
  start_nonvar_var = c(0.01, 1), lower_nonvar_var = c(-Inf, 1e-04),
  upper_nonvar_var = c(Inf, Inf), jitter_start = 0.01,
  hcubature_list = list(tol = 1e-08), nlminb_list = list(control =
  list(trace = 1, eval.max = 500, iter.max = 500)),
  hessian_list = list(method.args = list(r = 4)), nlminb_object = NULL)

Arguments

`g`	Numeric vector with pool sizes, i.e. number of members in each pool.
`y`	Numeric vector with poolwise Y values, coded 0 if all members are controls and 1 if all members are cases.
`xtilde`	Numeric vector (or list of numeric vectors, if some pools have replicates) with Xtilde values.
`c`	List where each element is a numeric matrix containing the C values for members of a particular pool (1 row for each member).
`constant_or`	Logical value for whether to assume a constant OR for X, which means that gamma_y = 0. If `NULL`, model is fit with and without this assumption, and a likelihood ratio test is performed to test it.
`errors`	Character string specifying the errors that `X` is subject to. Choices are `"neither"`, `"processing"` for processing error only, `"measurement"` for measurement error only, and `"both"`.
`estimate_var`	Logical value for whether to return variance-covariance matrix for parameter estimates.
`start_nonvar_var`	Numeric vector of length 2 specifying starting value for non-variance terms and variance terms, respectively.
`lower_nonvar_var`	Numeric vector of length 2 specifying lower bound for non-variance terms and variance terms, respectively.
`upper_nonvar_var`	Numeric vector of length 2 specifying upper bound for non-variance terms and variance terms, respectively.
`jitter_start`	Numeric value specifying standard deviation for mean-0 normal jitters to add to starting values for a second try at maximizing the log-likelihood, should the initial call to `nlminb` result in non-convergence. Set to `NULL` for no second try.
`hcubature_list`	List of arguments to pass to `hcubature` for numerical integration.
`nlminb_list`	List of arguments to pass to `nlminb` for log-likelihood maximization.
`hessian_list`	List of arguments to pass to `hessian` for approximating the Hessian matrix. Only used if `estimate_var = TRUE`.
`nlminb_object`	Object returned from `nlminb` in a prior call. Useful for bypassing log-likelihood maximization if you just want to re-estimate the Hessian matrix with different options.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix.
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

If constant_or = NULL, two such lists are returned (one under a constant odds ratio assumption and one not), along with a likelihood ratio test for H0: gamma_y = 0, which is equivalent to H0: odds ratio is constant.

References

Lyles, R.H., Van Domelen, D.R., Mitchell, E.M. and Schisterman, E.F. (2015) "A discriminant function approach to adjust for processing and measurement error When a biomarker is assayed in pooled samples." Int. J. Environ. Res. Public Health 12(11): 14723–14740.

Mitchell, E.M, Lyles, R.H., and Schisterman, E.F. (2015) "Positing, fitting, and selecting regression models for pooled biomarker data." Stat. Med 34(17): 2544–2558.

Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597–613.

Whitcomb, B.W., Perkins, N.J., Zhang, Z., Ye, A., and Lyles, R. H. (2012) "Assessment of skewed exposure in case-control studies with pooling." Stat. Med. 31: 2461–2472.

Examples

# Load data frame with (g, Y, X, Xtilde) values for 496 pools, list of C
# values for members of each pool, and list of Xtilde values where 25
# single-specimen pools have replicates. Xtilde values are affected by
# processing error and measurement error. True log-OR = 0.5, sigsq_p = 0.25,
# sigsq_m = 0.1.
data(dat_p_gdfa)
dat <- dat_p_gdfa$dat
reps <- dat_p_gdfa$reps
c.list <- dat_p_gdfa$c.list

# Unobservable truth estimator - use precise X's
fit.unobservable <- p_gdfa(
  g = dat$g,
  y = dat$y,
  xtilde = dat$x,
  c = c.list,
  errors = "neither"
)
fit.unobservable$estimates

# Naive estimator - use imprecise Xtilde's, but treat as precise
fit.naive <- p_gdfa(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "neither"
)
fit.naive$estimates

# Corrected estimator - use Xtilde's and account for errors (not using
# replicates here)
## Not run: 
fit.noreps <- p_gdfa(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "both"
)
fit.noreps$estimates

# Corrected estimator - use Xtilde's including 25 replicates
fit.reps <- p_gdfa(
  g = dat$g,
  y = dat$y,
  xtilde = reps,
  c = c.list,
  errors = "both"
)
fit.reps$estimates

# Same as previous, but allowing for non-constant odds ratio.
fit.nonconstant <- p_gdfa(
  g = dat$g,
  y = dat$y,
  xtilde = reps,
  c = c.list,
  constant_or = FALSE,
  errors = "both",
  hcubature_list = list(tol = 1e-4)
)
fit.nonconstant$estimates

# Visualize estimated log-OR vs. X based on previous model fit
p <- plot_gdfa(
  estimates = fit.nonconstant$estimates,
  varcov = fit.nonconstant$theta.var,
  xrange = range(dat$xtilde[dat$g == 1]),
  cvals = mean(unlist(c))
)
p

## End(Not run)

[Package pooling version 1.1.2 Index]