R: Poolwise Logistic Regression with Gamma Exposure Subject to...

p_logreg_xerrors2 {pooling}

R Documentation

Poolwise Logistic Regression with Gamma Exposure Subject to Errors

Description

Assumes a constant-scale Gamma model for exposure given covariates, with multiplicative lognormal processing errors and measurement errors acting on the poolwise mean exposure. A manuscript fully describing the approach is under review.
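The error structure described above can be illustrated with a small simulation. This is a minimal sketch in base R, not the package's own code. It assumes, for illustration only, that there are no covariates, that the errors are mean-1 lognormal (log-scale mean of minus half the log-scale variance), and that only processing error is present. All parameter values and object names are hypothetical.

```r
set.seed(1)

n_pools <- 200
g <- sample(1:3, n_pools, replace = TRUE)  # pool sizes

# Hypothetical Gamma parameters; with covariates, the shape would vary
# across members while the scale stays constant
shape <- 2
scale <- 0.5

# Hypothetical log-scale variance of the processing error
sigsq_p <- 0.1

# Poolwise mean exposure: average of member-level Gamma draws
xbar <- vapply(
  g,
  function(gi) mean(rgamma(gi, shape = shape, scale = scale)),
  numeric(1)
)

# Multiplicative lognormal processing error, assumed here to have mean 1
pe <- rlnorm(n_pools, meanlog = -sigsq_p / 2, sdlog = sqrt(sigsq_p))

# Observed (error-prone) poolwise measurement
xtilde <- xbar * pe
```

Because the simulated error multiplies the poolwise mean, `xtilde` stays positive, which is consistent with a Gamma-distributed exposure.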

Usage

p_logreg_xerrors2(g = NULL, y, xtilde, c = NULL,
  errors = "processing", nondiff_pe = TRUE, nondiff_me = TRUE,
  constant_pe = TRUE, prev = NULL, samp_y1y0 = NULL,
  estimate_var = TRUE, start_nonvar_var = c(0.01, 1),
  lower_nonvar_var = c(-Inf, 1e-04), upper_nonvar_var = c(Inf, Inf),
  jitter_start = 0.01, hcubature_list = list(tol = 1e-08),
  nlminb_list = list(control = list(trace = 1, eval.max = 500, iter.max =
  500)), hessian_list = list(method.args = list(r = 4)),
  nlminb_object = NULL)

Arguments

`g`	Numeric vector with pool sizes, i.e. number of members in each pool.
`y`	Numeric vector with poolwise Y values, coded 0 if all members are controls and 1 if all members are cases.
`xtilde`	Numeric vector (or list of numeric vectors, if some pools have replicates) with Xtilde values.
`c`	List where each element is a numeric matrix containing the C values for members of a particular pool (1 row for each member).
`errors`	Character string specifying the errors that X is subject to. Choices are `"neither"`, `"processing"` for processing error only, `"measurement"` for measurement error only, and `"both"`.
`nondiff_pe`	Logical value for whether to assume the processing error variance is non-differential, i.e. the same in case pools and control pools.
`nondiff_me`	Logical value for whether to assume the measurement error variance is non-differential, i.e. the same in case pools and control pools.
`constant_pe`	Logical value for whether to assume the processing error variance is constant with pool size. If `FALSE`, the assumption is that processing error variance increases with pool size such that, for example, the processing error affecting a pool 2x as large as another has 2x the variance.
`prev`	Numeric value specifying disease prevalence, allowing for valid estimation of the intercept with case-control sampling. Can specify `samp_y1y0` instead if sampling rates are known.
`samp_y1y0`	Numeric vector of length 2 specifying sampling probabilities for cases and controls, allowing for valid estimation of the intercept with case-control sampling. Can specify `prev` instead if it's easier.
`estimate_var`	Logical value for whether to return variance-covariance matrix for parameter estimates.
`start_nonvar_var`	Numeric vector of length 2 specifying starting value for non-variance terms and variance terms, respectively.
`lower_nonvar_var`	Numeric vector of length 2 specifying lower bound for non-variance terms and variance terms, respectively.
`upper_nonvar_var`	Numeric vector of length 2 specifying upper bound for non-variance terms and variance terms, respectively.
`jitter_start`	Numeric value specifying standard deviation for mean-0 normal jitters to add to starting values for a second try at maximizing the log-likelihood, should the initial call to `nlminb` result in non-convergence. Set to `NULL` for no second try.
`hcubature_list`	List of arguments to pass to `hcubature` for numerical integration.
`nlminb_list`	List of arguments to pass to `nlminb` for log-likelihood maximization.
`hessian_list`	List of arguments to pass to `hessian` for approximating the Hessian matrix. Only used if `estimate_var = TRUE`.
`nlminb_object`	Object returned from `nlminb` in a prior call. Useful for bypassing log-likelihood maximization if you just want to re-estimate the Hessian matrix with different options.
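
The `g` and `c` arguments are often derived from member-level data. The following is a minimal base R sketch, assuming a hypothetical data frame `members` with a pool identifier and two covariates. The column names are illustrative and do not come from the package.

```r
# Hypothetical member-level data: one row per individual
members <- data.frame(
  pool = c(1, 1, 2, 2, 2, 3),
  c1   = c(0.5, 1.2, -0.3, 0.8, 0.1, 2.0),
  c2   = c(1, 0, 0, 1, 1, 0)
)

# One numeric matrix per pool, with one row per member
c_list <- lapply(
  split(members[, c("c1", "c2")], members$pool),
  as.matrix
)
c_list <- unname(c_list)

# Pool sizes follow from the number of rows in each matrix
g <- vapply(c_list, nrow, integer(1))
```

The pools in `c_list` are ordered by pool identifier here, so `y` and `xtilde` would presumably need to be supplied in that same order.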

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if `estimate_var = TRUE`).
Returned `nlminb` object from maximizing the log-likelihood function.
Akaike information criterion (AIC).
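
The parameter estimates and variance-covariance matrix can be combined into Wald confidence intervals. The helper below is a sketch under the usual large-sample normal approximation. `wald_ci` is a hypothetical function, not part of the package, and the numbers in the usage lines are made up.

```r
# Hypothetical helper: Wald confidence intervals from a vector of
# estimates and the matching variance-covariance matrix
wald_ci <- function(est, vcov, level = 0.95) {
  se <- sqrt(diag(vcov))               # standard errors
  z <- qnorm(1 - (1 - level) / 2)      # normal critical value
  cbind(estimate = est, lower = est - z * se, upper = est + z * se)
}

# Made-up values for illustration only
est <- c(beta_0 = -1.2, beta_x = 0.4)
V <- diag(c(0.04, 0.01))
wald_ci(est, V)
```

For a log-odds ratio, exponentiating the estimate and both limits gives an interval on the odds ratio scale.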

References

Mitchell, E.M., Lyles, R.H., and Schisterman, E.F. (2015) "Positing, fitting, and selecting regression models for pooled biomarker data." Stat. Med. 34(17): 2544–2558.

Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597–613.

Weinberg, C.R. and Umbach, D.M. (1999) "Using pooled exposure assessment to improve efficiency in case-control studies." Biometrics 55: 718–726.

Weinberg, C.R. and Umbach, D.M. (2014) "Correction to 'Using pooled exposure assessment to improve efficiency in case-control studies' by Clarice R. Weinberg and David M. Umbach; 55, 718–726, September 1999." Biometrics 70: 1061.

Whitcomb, B.W., Perkins, N.J., Zhang, Z., Ye, A., and Lyles, R.H. (2012) "Assessment of skewed exposure in case-control studies with pooling." Stat. Med. 31: 2461–2472.

Examples

# Load dataset with (g, Y, Xtilde, C) values for 248 pools and list of C
# values for members of each pool. Xtilde values are affected by processing
# error.
data(pdat2)
dat <- pdat2$dat
c.list <- pdat2$c.list

# Estimate log-OR for X and Y adjusted for C, ignoring processing error
fit1 <- p_logreg_xerrors2(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "neither"
)
fit1$theta.hat

# Repeat, but accounting for processing error.
## Not run: 
fit2 <- p_logreg_xerrors2(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "processing"
)
fit2$theta.hat

## End(Not run)

[Package pooling version 1.1.2 Index]