p_logreg_xerrors {pooling}

R Documentation

Poolwise Logistic Regression with Normal Exposure Subject to Errors

Description

Assumes a normal linear model for exposure given covariates, with additive normal processing errors and measurement errors acting on the poolwise mean exposure. A manuscript fully describing the approach is under review.
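The error structure described above can be sketched with a short base-R simulation. This is only an illustration under stated assumptions: the parameter names and values (`alpha_0`, `alpha_c`, `sigsq_x`, `sigsq_p`, `sigsq_m`) are hypothetical and not taken from the package, and errors are added to every pool for simplicity. The `constant_pe` flag mirrors the argument of the same name: when `FALSE`, the processing error variance is scaled by pool size.

```r
set.seed(123)

# Illustrative parameter values (assumptions, not package defaults)
alpha_0 <- 0       # intercept of the exposure model
alpha_c <- 0.25    # covariate effect on exposure
sigsq_x <- 1       # residual variance of X given C
sigsq_p <- 0.5     # processing error variance
sigsq_m <- 0.1     # measurement error variance
constant_pe <- TRUE

n_pools <- 1000
g <- sample(1:3, n_pools, replace = TRUE)  # pool sizes

# Draw member-level C and X for one pool and return the poolwise mean of X
sim_pool_mean <- function(size) {
  c_ij <- rnorm(size)
  x_ij <- alpha_0 + alpha_c * c_ij + rnorm(size, sd = sqrt(sigsq_x))
  mean(x_ij)
}
x_bar <- vapply(g, sim_pool_mean, numeric(1))

# Processing error variance: constant, or proportional to pool size
pe_var <- if (constant_pe) rep(sigsq_p, n_pools) else sigsq_p * g

# Additive normal errors acting on the poolwise mean exposure
e_p <- rnorm(n_pools, sd = sqrt(pe_var))
e_m <- rnorm(n_pools, sd = sqrt(sigsq_m))
xtilde <- x_bar + e_p + e_m
```

Here `xtilde` plays the role of the error-prone poolwise measurement, and `x_bar` is the unobserved true poolwise mean.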

Usage

p_logreg_xerrors(g, y, xtilde, c = NULL, errors = "processing",
  nondiff_pe = TRUE, nondiff_me = TRUE, constant_pe = TRUE,
  prev = NULL, samp_y1y0 = NULL, approx_integral = TRUE,
  estimate_var = TRUE, start_nonvar_var = c(0.01, 1),
  lower_nonvar_var = c(-Inf, 1e-04), upper_nonvar_var = c(Inf, Inf),
  jitter_start = 0.01, hcubature_list = list(tol = 1e-08),
  nlminb_list = list(control = list(trace = 1, eval.max = 500,
    iter.max = 500)),
  hessian_list = list(method.args = list(r = 4)),
  nlminb_object = NULL)

Arguments

`g`	Numeric vector with pool sizes, i.e. number of members in each pool.
`y`	Numeric vector with poolwise Y values, coded 0 if all members are controls and 1 if all members are cases.
`xtilde`	Numeric vector (or list of numeric vectors, if some pools have replicates) with Xtilde values.
`c`	Numeric matrix with poolwise C values (if any), with one row for each pool. Can be a vector if there is only 1 covariate.
`errors`	Character string specifying the errors that X is subject to. Choices are `"neither"`, `"processing"` for processing error only, `"measurement"` for measurement error only, and `"both"`.
`nondiff_pe`	Logical value for whether to assume the processing error variance is non-differential, i.e. the same in case pools and control pools.
`nondiff_me`	Logical value for whether to assume the measurement error variance is non-differential, i.e. the same in case pools and control pools.
`constant_pe`	Logical value for whether to assume the processing error variance is constant with pool size. If `FALSE`, the assumption is that the processing error variance increases with pool size such that, for example, the processing error affecting a pool 2x as large as another has 2x the variance.
`prev`	Numeric value specifying disease prevalence, allowing for valid estimation of the intercept with case-control sampling. Can specify `samp_y1y0` instead if sampling rates are known.
`samp_y1y0`	Numeric vector of length 2 specifying sampling probabilities for cases and controls, allowing for valid estimation of the intercept with case-control sampling. Can specify `prev` instead if it's easier.
`approx_integral`	Logical value for whether to use the probit approximation for the logistic-normal integral, to avoid numerically integrating X's out of the likelihood function.
`estimate_var`	Logical value for whether to return variance-covariance matrix for parameter estimates.
`start_nonvar_var`	Numeric vector of length 2 specifying starting values for non-variance terms and variance terms, respectively.
`lower_nonvar_var`	Numeric vector of length 2 specifying lower bounds for non-variance terms and variance terms, respectively.
`upper_nonvar_var`	Numeric vector of length 2 specifying upper bounds for non-variance terms and variance terms, respectively.
`jitter_start`	Numeric value specifying standard deviation for mean-0 normal jitters to add to starting values for a second try at maximizing the log-likelihood, should the initial call to `nlminb` result in non-convergence. Set to `NULL` for no second try.
`hcubature_list`	List of arguments to pass to `hcubature` for numerical integration. Only used if `approx_integral = FALSE`.
`nlminb_list`	List of arguments to pass to `nlminb` for log-likelihood maximization.
`hessian_list`	List of arguments to pass to `hessian` for approximating the Hessian matrix. Only used if `estimate_var = TRUE`.
`nlminb_object`	Object returned from `nlminb` in a prior call. Useful for bypassing log-likelihood maximization if you just want to re-estimate the Hessian matrix with different options.
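To give a sense of what `approx_integral = TRUE` trades off, the sketch below compares a common probit-based approximation to the logistic-normal integral against direct numerical integration. It is a base-R illustration, not the package's code: the scaling constant `k = 1.7` is an assumption (the value used internally is not stated here), the function names are hypothetical, and `integrate` stands in for `hcubature`.

```r
# E[expit(mu + sigma * Z)] for Z ~ N(0, 1), by numerical integration
expit_normal_exact <- function(mu, sigma) {
  integrand <- function(z) plogis(mu + sigma * z) * dnorm(z)
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Probit approximation: expit(x) is close to pnorm(x / k) with k = 1.7,
# which gives a closed form after integrating out Z
expit_normal_probit <- function(mu, sigma, k = 1.7) {
  plogis(mu / sqrt(1 + sigma^2 / k^2))
}

exact  <- expit_normal_exact(mu = 0.5, sigma = 1)
approx <- expit_normal_probit(mu = 0.5, sigma = 1)
abs(exact - approx)  # small absolute difference
```

The closed form avoids one numerical integration per pool at each likelihood evaluation, which is the motivation for using the approximation by default.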

Value

List containing:

Numeric vector of parameter estimates (`theta.hat`, as used in the examples).
Variance-covariance matrix (if `estimate_var = TRUE`).
Returned `nlminb` object from maximizing the log-likelihood function.
Akaike information criterion (AIC).
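The relationship between these returned quantities can be sketched in base R. The example below is a stand-in, not the package's likelihood: it fits an ordinary logistic regression to simulated individual-level data by minimizing a negative log-likelihood with `nlminb`, inverts a numerically approximated Hessian to obtain a variance-covariance matrix, and computes AIC from the minimized objective. `optimHess` from base R is used here in place of `hessian`, and all object names (`negll`, `theta_hat`, `theta_var`) are hypothetical.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.5 * x))

# Negative log-likelihood for ordinary logistic regression (stand-in model)
negll <- function(theta) {
  eta <- theta[1] + theta[2] * x
  -sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}

# Maximize the log-likelihood by minimizing its negative
fit <- nlminb(start = c(0.01, 0.01), objective = negll)
theta_hat <- fit$par

# Variance-covariance matrix: inverse Hessian of the negative log-likelihood
theta_var <- solve(optimHess(theta_hat, negll))
se <- sqrt(diag(theta_var))

# Wald 95% confidence intervals
ci <- cbind(lower = theta_hat - qnorm(0.975) * se,
            upper = theta_hat + qnorm(0.975) * se)

# AIC = 2 * (number of parameters) - 2 * (maximized log-likelihood)
aic <- 2 * length(theta_hat) + 2 * fit$objective
```

The same arithmetic applies to any fit that supplies parameter estimates, a variance-covariance matrix, and an `nlminb` object whose `objective` is a minimized negative log-likelihood.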

References

Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597–613.

Weinberg, C.R. and Umbach, D.M. (1999) "Using pooled exposure assessment to improve efficiency in case-control studies." Biometrics 55: 718–726.

Weinberg, C.R. and Umbach, D.M. (2014) "Correction to 'Using pooled exposure assessment to improve efficiency in case-control studies' by Clarice R. Weinberg and David M. Umbach; 55, 718–726, September 1999." Biometrics 70: 1061.

Examples

# Load dataset containing (Y, Xtilde, C) values for pools of size 1, 2, and
# 3. Xtilde values are affected by processing error.
data(pdat1)

# Estimate log-OR for X and Y adjusted for C, ignoring processing error
fit1 <- p_logreg_xerrors(
  g = pdat1$g,
  y = pdat1$allcases,
  xtilde = pdat1$xtilde,
  c = pdat1$c,
  errors = "neither"
)
fit1$theta.hat

# Repeat, but accounting for processing error. Closer to true log-OR of 0.5.
fit2 <- p_logreg_xerrors(
  g = pdat1$g,
  y = pdat1$allcases,
  xtilde = pdat1$xtilde,
  c = pdat1$c,
  errors = "processing"
)
fit2$theta.hat

[Package pooling version 1.1.2 Index]