p_logreg_xerrors {pooling} | R Documentation |
Poolwise Logistic Regression with Normal Exposure Subject to Errors
Description
Assumes a normal linear model for exposure given covariates, with additive normal processing errors and measurement errors acting on the poolwise mean exposure. A manuscript fully describing the approach is under review.
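As a rough illustration of the error structure described above, the following base-R sketch simulates pools whose measured value is the poolwise mean exposure plus additive normal errors. All parameter values and variable names are hypothetical, and the choice to apply processing error only to pools with more than one member is an assumption made for this sketch, not something stated on this page.

```r
set.seed(123)

# Hypothetical parameter values, for illustration only
alpha_0 <- 0.25  # intercept of the X | C linear model
alpha_c <- 0.5   # slope for C in the X | C linear model
sigsq_x <- 1     # residual variance of X given C
sigsq_p <- 0.5   # processing error variance
sigsq_m <- 0.1   # measurement error variance

n_pools <- 2000
g <- sample(1:3, n_pools, replace = TRUE)  # pool sizes

simulate_pool <- function(g_i) {
  # Individual-level covariate and exposure for the g_i members of one pool
  c_ij <- rnorm(g_i)
  x_ij <- alpha_0 + alpha_c * c_ij + rnorm(g_i, sd = sqrt(sigsq_x))
  xbar <- mean(x_ij)

  # Assumption: processing error arises only when specimens are combined
  pe <- if (g_i > 1) rnorm(1, sd = sqrt(sigsq_p)) else 0
  me <- rnorm(1, sd = sqrt(sigsq_m))

  c(g = g_i, cbar = mean(c_ij), xbar = xbar, xtilde = xbar + pe + me)
}

pools <- as.data.frame(t(vapply(g, simulate_pool, numeric(4))))

# Empirical variance of the total error (xtilde - xbar), by pool size
err_var <- tapply(pools$xtilde - pools$xbar, pools$g, var)
err_var
```

Under these assumptions the error variance should be near sigsq_m for pools of size 1 and near sigsq_p + sigsq_m for larger pools.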
Usage
p_logreg_xerrors(g, y, xtilde, c = NULL, errors = "processing",
  nondiff_pe = TRUE, nondiff_me = TRUE, constant_pe = TRUE,
  prev = NULL, samp_y1y0 = NULL, approx_integral = TRUE,
  estimate_var = TRUE, start_nonvar_var = c(0.01, 1),
  lower_nonvar_var = c(-Inf, 1e-04), upper_nonvar_var = c(Inf, Inf),
  jitter_start = 0.01, hcubature_list = list(tol = 1e-08),
  nlminb_list = list(control = list(trace = 1, eval.max = 500,
    iter.max = 500)), hessian_list = list(method.args = list(r = 4)),
  nlminb_object = NULL)
Arguments
g |
Numeric vector with pool sizes, i.e. number of members in each pool. |
y |
Numeric vector with poolwise Y values, coded 0 if all members are controls and 1 if all members are cases. |
xtilde |
Numeric vector (or list of numeric vectors, if some pools have replicates) with Xtilde values. |
c |
Numeric matrix with poolwise C values (if any), with one row for each pool. Can be a vector if there is only 1 covariate. |
errors |
Character string specifying the errors that X is subject to.
Choices are |
nondiff_pe |
Logical value for whether to assume the processing error variance is non-differential, i.e. the same in case pools and control pools. |
nondiff_me |
Logical value for whether to assume the measurement error variance is non-differential, i.e. the same in case pools and control pools. |
constant_pe |
Logical value for whether to assume the processing error
variance is constant with pool size. If |
prev |
Numeric value specifying disease prevalence, allowing
for valid estimation of the intercept with case-control sampling. Can specify
|
samp_y1y0 |
Numeric vector of length 2 specifying sampling probabilities
for cases and controls, allowing for valid estimation of the intercept with
case-control sampling. Can specify |
approx_integral |
Logical value for whether to use the probit approximation for the logistic-normal integral, to avoid numerically integrating X's out of the likelihood function. |
estimate_var |
Logical value for whether to return the variance-covariance matrix for the parameter estimates. |
start_nonvar_var |
Numeric vector of length 2 specifying starting values for non-variance terms and variance terms, respectively. |
lower_nonvar_var |
Numeric vector of length 2 specifying lower bounds for non-variance terms and variance terms, respectively. |
upper_nonvar_var |
Numeric vector of length 2 specifying upper bounds for non-variance terms and variance terms, respectively. |
jitter_start |
Numeric value specifying standard deviation for mean-0
normal jitters to add to starting values for a second try at maximizing the
log-likelihood, should the initial call to |
hcubature_list |
List of arguments to pass to
|
nlminb_list |
List of arguments to pass to |
hessian_list |
List of arguments to pass to
|
nlminb_object |
Object returned from |
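The approx_integral argument refers to a probit approximation for the logistic-normal integral. The sketch below shows one common form of that approximation in base R, comparing it with direct numerical integration of E[expit(eta)] for eta ~ N(mu, sigsq). The function names and the scaling constant k = 1.7 are assumptions for illustration; the exact form used inside the package may differ.

```r
# "Exact" value by one-dimensional numerical integration
expit_normal_exact <- function(mu, sigsq) {
  integrand <- function(x) plogis(x) * dnorm(x, mean = mu, sd = sqrt(sigsq))
  integrate(integrand, lower = -Inf, upper = Inf)$value
}

# Probit-based closed form: approximate expit(x) by pnorm(x / k), integrate
# the normal CDF against the normal density analytically, then map back to
# the logistic scale
expit_normal_probit <- function(mu, sigsq, k = 1.7) {
  plogis(mu / sqrt(1 + sigsq / k^2))
}

# Compare the two over a small grid of means and variances
grid <- expand.grid(mu = c(-2, -0.5, 0.5, 2), sigsq = c(0.25, 1, 4))
grid$exact <- mapply(expit_normal_exact, grid$mu, grid$sigsq)
grid$approx <- mapply(expit_normal_probit, grid$mu, grid$sigsq)
grid$abs_diff <- abs(grid$exact - grid$approx)
grid
```

The closed form avoids a numerical integral for every likelihood evaluation, at the cost of a small approximation error.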
Value
List containing:
Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).
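To make the returned components concrete, here is a minimal base-R sketch of how such quantities are typically obtained for a toy normal likelihood; it is not the package's internal code. The starting values and bounds mirror the defaults shown in Usage, the retry with jittered starting values assumes that non-convergence is the trigger, and optimHess() is used as a stand-in for whichever Hessian routine the package calls.

```r
set.seed(1)
x <- rnorm(200, mean = 2, sd = 1.5)

# Negative log-likelihood for N(mu, sigsq), with theta = c(mu, sigsq)
nll <- function(theta) {
  -sum(dnorm(x, mean = theta[1], sd = sqrt(theta[2]), log = TRUE))
}

start <- c(0.01, 1)      # non-variance term, variance term
lower <- c(-Inf, 1e-04)  # keep the variance strictly positive
upper <- c(Inf, Inf)

fit <- nlminb(start = start, objective = nll, lower = lower, upper = upper)

# Assumed retry rule: if nlminb reports non-convergence, jitter the starting
# values with mean-0 normal noise and try once more
if (fit$convergence != 0) {
  fit <- nlminb(start = start + rnorm(2, sd = 0.01), objective = nll,
                lower = lower, upper = upper)
}

theta_hat <- fit$par

# Variance-covariance matrix: inverse Hessian of the negative log-likelihood
theta_var <- solve(optimHess(theta_hat, nll))

# AIC = 2 * (number of parameters) - 2 * (maximized log-likelihood)
aic <- 2 * (length(theta_hat) + fit$objective)
```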
References
Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597–613.
Weinberg, C.R. and Umbach, D.M. (1999) "Using pooled exposure assessment to improve efficiency in case-control studies." Biometrics 55: 718–726.
Weinberg, C.R. and Umbach, D.M. (2014) "Correction to 'Using pooled exposure assessment to improve efficiency in case-control studies' by Clarice R. Weinberg and David M. Umbach; 55, 718–726, September 1999." Biometrics 70: 1061.
Examples
# Load dataset containing (Y, Xtilde, C) values for pools of size 1, 2, and
# 3. Xtilde values are affected by processing error.
data(pdat1)
# Estimate log-OR for X and Y adjusted for C, ignoring processing error
fit1 <- p_logreg_xerrors(
  g = pdat1$g,
  y = pdat1$allcases,
  xtilde = pdat1$xtilde,
  c = pdat1$c,
  errors = "neither"
)
fit1$theta.hat
# Repeat, but accounting for processing error. Closer to true log-OR of 0.5.
fit2 <- p_logreg_xerrors(
  g = pdat1$g,
  y = pdat1$allcases,
  xtilde = pdat1$xtilde,
  c = pdat1$c,
  errors = "processing"
)
fit2$theta.hat