p_logreg_xerrors2 {pooling}    R Documentation

Poolwise Logistic Regression with Gamma Exposure Subject to Errors

Description

Assumes a constant-scale Gamma model for exposure given covariates, with multiplicative lognormal processing errors and measurement errors acting on the poolwise mean exposure. A manuscript fully describing the approach is under review.
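As a rough illustration of this error structure, the following base R sketch simulates error-prone poolwise measurements. The parameter values, the mean-1 parameterization of the lognormal errors, and the choice to leave single-member pools free of processing error are assumptions made for the sketch, not details confirmed by this page.

```r
set.seed(123)

g <- c(1, 2, 2, 3, 3)          # pool sizes (hypothetical values)
n_pools <- length(g)

# Individual exposures: Gamma with a shared (constant) scale. The shape would
# normally depend on covariates; it is fixed here to keep the sketch short.
scale_b <- 0.5
x_members <- lapply(g, function(gi) rgamma(gi, shape = 2, scale = scale_b))
xbar <- vapply(x_members, mean, numeric(1))   # true poolwise mean exposure

# Multiplicative lognormal errors, parameterized to have mean 1 (assumption)
sigsq_p <- 0.10   # processing error variance on the log scale
sigsq_m <- 0.05   # measurement error variance on the log scale
pe <- rlnorm(n_pools, meanlog = -sigsq_p / 2, sdlog = sqrt(sigsq_p))
pe[g == 1] <- 1   # assumption: single-member pools are not "processed"
me <- rlnorm(n_pools, meanlog = -sigsq_m / 2, sdlog = sqrt(sigsq_m))

# Error-prone poolwise measurement, analogous to the xtilde argument
xtilde <- xbar * pe * me
```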

Usage

p_logreg_xerrors2(g = NULL, y, xtilde, c = NULL,
  errors = "processing", nondiff_pe = TRUE, nondiff_me = TRUE,
  constant_pe = TRUE, prev = NULL, samp_y1y0 = NULL,
  estimate_var = TRUE, start_nonvar_var = c(0.01, 1),
  lower_nonvar_var = c(-Inf, 1e-04), upper_nonvar_var = c(Inf, Inf),
  jitter_start = 0.01, hcubature_list = list(tol = 1e-08),
  nlminb_list = list(control = list(trace = 1, eval.max = 500, iter.max =
  500)), hessian_list = list(method.args = list(r = 4)),
  nlminb_object = NULL)

Arguments

g

Numeric vector with pool sizes, i.e. number of members in each pool.

y

Numeric vector with poolwise Y values, coded 0 if all members are controls and 1 if all members are cases.

xtilde

Numeric vector (or list of numeric vectors, if some pools have replicates) with Xtilde values.

c

List where each element is a numeric matrix containing the C values for members of a particular pool (1 row for each member).

errors

Character string specifying the errors that X is subject to. Choices are "neither", "processing" for processing error only, "measurement" for measurement error only, and "both".

nondiff_pe

Logical value for whether to assume the processing error variance is non-differential, i.e. the same in case pools and control pools.

nondiff_me

Logical value for whether to assume the measurement error variance is non-differential, i.e. the same in case pools and control pools.

constant_pe

Logical value for whether to assume the processing error variance is constant with pool size. If FALSE, the assumption is that the processing error variance increases in proportion to pool size, so that, for example, the processing error affecting a pool 2x as large as another has 2x the variance.
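The two assumptions can be compared with a small sketch. The helper pe_variance below is hypothetical and not part of the pooling package; it also assumes the non-constant case takes the form sigsq_p * g, whereas this page only states that the variance is proportional to pool size.

```r
# Hypothetical helper: processing error variance for each pool
pe_variance <- function(g, sigsq_p, constant_pe = TRUE) {
  if (constant_pe) {
    rep(sigsq_p, length(g))   # same variance regardless of pool size
  } else {
    sigsq_p * g               # assumed form: variance proportional to pool size
  }
}

g <- c(2, 4, 8)
pe_variance(g, sigsq_p = 0.1, constant_pe = TRUE)
pe_variance(g, sigsq_p = 0.1, constant_pe = FALSE)
```

Under the non-constant assumption, the pool of size 4 gets twice the variance of the pool of size 2.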

prev

Numeric value specifying disease prevalence, allowing for valid estimation of the intercept with case-control sampling. Can specify samp_y1y0 instead if sampling rates are known.

samp_y1y0

Numeric vector of length 2 specifying sampling probabilities for cases and controls, allowing for valid estimation of the intercept with case-control sampling. Can specify prev instead if it's easier.
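To see why either prev or samp_y1y0 is enough, recall that in ordinary individual-level logistic regression under case-control sampling, the fitted intercept is shifted by the log ratio of the case and control sampling probabilities. The sketch below computes that log ratio from either input. The helper log_sampling_ratio is hypothetical, and the poolwise offset actually used inside p_logreg_xerrors2 may take a different form.

```r
# Hypothetical helper: log(P(sampled | case) / P(sampled | control))
log_sampling_ratio <- function(n_cases, n_controls,
                               prev = NULL, samp_y1y0 = NULL) {
  if (!is.null(samp_y1y0)) {
    # Sampling probabilities given directly as c(cases, controls)
    log(samp_y1y0[1] / samp_y1y0[2])
  } else if (!is.null(prev)) {
    # Implied by prevalence: the unknown population size cancels in the ratio
    log((n_cases / n_controls) * (1 - prev) / prev)
  } else {
    stop("Specify either prev or samp_y1y0")
  }
}

# Population of 10,000 with 10% prevalence: 1,000 cases and 9,000 controls.
# Sampling 500 cases and 900 controls gives sampling rates 0.5 and 0.1.
log_sampling_ratio(500, 900, samp_y1y0 = c(0.5, 0.1))
log_sampling_ratio(500, 900, prev = 0.1)
```

Both calls describe the same sampling scheme, so they return the same value.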

estimate_var

Logical value for whether to return variance-covariance matrix for parameter estimates.

start_nonvar_var

Numeric vector of length 2 specifying starting values for non-variance terms and variance terms, respectively.

lower_nonvar_var

Numeric vector of length 2 specifying lower bounds for non-variance terms and variance terms, respectively.

upper_nonvar_var

Numeric vector of length 2 specifying upper bounds for non-variance terms and variance terms, respectively.

jitter_start

Numeric value specifying standard deviation for mean-0 normal jitters to add to starting values for a second try at maximizing the log-likelihood, should the initial call to nlminb result in non-convergence. Set to NULL for no second try.
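The retry logic can be sketched roughly as follows. This is a minimal illustration on a toy normal likelihood, not the package's internal code. The objective negll, the data, and the clamping of jittered values to the bounds are all assumptions of the sketch.

```r
# Toy negative log-likelihood standing in for the real one (hypothetical)
set.seed(42)
x <- rnorm(200, mean = 2, sd = 1.5)
negll <- function(theta) {
  -sum(dnorm(x, mean = theta[1], sd = sqrt(theta[2]), log = TRUE))
}

start <- c(0.01, 1)        # non-variance term, variance term
lower <- c(-Inf, 1e-04)
upper <- c(Inf, Inf)
jitter_start <- 0.01

fit <- nlminb(start = start, objective = negll, lower = lower, upper = upper)

# nlminb sets convergence to 0 on success; otherwise retry once from a
# jittered starting point, unless jitter_start is NULL
if (fit$convergence != 0 && !is.null(jitter_start)) {
  start2 <- start + rnorm(length(start), mean = 0, sd = jitter_start)
  start2 <- pmin(pmax(start2, lower), upper)   # keep within bounds (assumption)
  fit <- nlminb(start = start2, objective = negll, lower = lower, upper = upper)
}

fit$par
```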

hcubature_list

List of arguments to pass to hcubature for numerical integration.

nlminb_list

List of arguments to pass to nlminb for log-likelihood maximization.

hessian_list

List of arguments to pass to hessian for approximating the Hessian matrix. Only used if estimate_var = TRUE.

nlminb_object

Object returned from nlminb in a prior call. Useful for bypassing log-likelihood maximization if you just want to re-estimate the Hessian matrix with different options.

Value

List containing:

  1. Numeric vector of parameter estimates.

  2. Variance-covariance matrix (if estimate_var = TRUE).

  3. Returned nlminb object from maximizing the log-likelihood function.

  4. Akaike information criterion (AIC).
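As a rough guide to how items 2 and 4 relate to the nlminb fit, the sketch below derives a variance-covariance matrix and an AIC from a minimized negative log-likelihood. It uses a toy normal likelihood and swaps in base R's optimHess for the Hessian approximation so that it runs without extra packages. The variable names are hypothetical, and the package's own computation may differ in detail.

```r
# Toy negative log-likelihood (hypothetical stand-in for the real one)
set.seed(7)
x <- rnorm(100, mean = 1, sd = 2)
negll <- function(theta) {
  -sum(dnorm(x, mean = theta[1], sd = sqrt(theta[2]), log = TRUE))
}

fit <- nlminb(start = c(0.01, 1), objective = negll, lower = c(-Inf, 1e-04))
theta_hat <- fit$par

# Variance-covariance matrix: inverse of the Hessian of the negative
# log-likelihood evaluated at the estimates
hess <- optimHess(theta_hat, negll)
theta_var <- solve(hess)

# AIC = 2 * (number of parameters) + 2 * (minimized negative log-likelihood)
aic <- 2 * (length(theta_hat) + fit$objective)

sqrt(diag(theta_var))   # standard errors
aic
```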

References

Mitchell, E.M., Lyles, R.H., and Schisterman, E.F. (2015) "Positing, fitting, and selecting regression models for pooled biomarker data." Stat. Med. 34(17): 2544–2558.

Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597–613.

Weinberg, C.R. and Umbach, D.M. (1999) "Using pooled exposure assessment to improve efficiency in case-control studies." Biometrics 55: 718–726.

Weinberg, C.R. and Umbach, D.M. (2014) "Correction to 'Using pooled exposure assessment to improve efficiency in case-control studies' by Clarice R. Weinberg and David M. Umbach; 55, 718–726, September 1999." Biometrics 70: 1061.

Whitcomb, B.W., Perkins, N.J., Zhang, Z., Ye, A., and Lyles, R.H. (2012) "Assessment of skewed exposure in case-control studies with pooling." Stat. Med. 31: 2461–2472.

Examples

# Load dataset with (g, Y, Xtilde, C) values for 248 pools and list of C
# values for members of each pool. Xtilde values are affected by processing
# error.
data(pdat2)
dat <- pdat2$dat
c.list <- pdat2$c.list

# Estimate log-OR for X and Y adjusted for C, ignoring processing error
fit1 <- p_logreg_xerrors2(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "neither"
)
fit1$theta.hat

# Repeat, but accounting for processing error.
## Not run: 
fit2 <- p_logreg_xerrors2(
  g = dat$g,
  y = dat$y,
  xtilde = dat$xtilde,
  c = c.list,
  errors = "processing"
)
fit2$theta.hat

## End(Not run)



[Package pooling version 1.1.2 Index]