R.s.estimate.me {Rsurrogate} | R Documentation |
Calculates the proportion of treatment effect explained correcting for measurement error in the surrogate marker
Description
This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. This function is intended to be used for a fully observed continuous outcome. The user must specify what type of estimation they would like (parametric or nonparametric estimation of the proportion explained, denoted by R) and what estimator they would like (see below for details).
Usage
R.s.estimate.me(sone, szero, yone, yzero, parametric = FALSE, estimator = "n",
me.variance, extrapolate = TRUE, transform = FALSE, naive = FALSE, Ronly = TRUE)
Arguments
sone |
numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
szero |
numeric vector or matrix; surrogate marker for control observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with |
yone |
numeric vector; primary outcome for treated observations, assumed to be continuous. |
yzero |
numeric vector; primary outcome for control observations, assumed to be continuous. |
parametric |
TRUE or FALSE; indicates whether the user wants the parametric approach to be used (TRUE) or nonparametric (FALSE). |
estimator |
options are "d","q","n" for parametric and "q","n" for nonparametric; "d" stands for the disattenuated estimator, "q" stands for the SIMEX estimator with quadratic extrapolation, "n" stands for the SIMEX estimator with a nonlinear extrapolation. Note that the nonlinear extrapolation may have convergence issues with a small sample size; if this occurs, please consider using quadratic extrapolation instead. |
me.variance |
the variance of the measurement error; must be provided. |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker. |
naive |
TRUE or FALSE; indicates whether the user wants the naive estimate (not correcting for measurement error) to also be calculated. |
Ronly |
TRUE or FALSE; indicates whether the user wants only R (and corresponding variance and confidence intervals) to be returned. |
Details
While there are many methods available to quantify the value of a surrogate marker, most assume that the marker is measured without error. This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. The user can choose either the parametric framework or nonparametric framework for estimation. Within the parametric framework there are three options for measurement error correction: the disattenuated estimator, the SIMEX estimator with quadratic extrapolation, and the SIMEX estimator with nonlinear extrapolation. Within the nonparametric framework there are two options for measurement error correction: the SIMEX estimator with quadratic extrapolation and the SIMEX estimator with nonlinear extrapolation. We describe each below.
Let G
be the binary treatment indicator with G=1
indicating treatment and G=0
indicating control (or placebo). We assume throughout that subjects are randomly assigned to treatment or control at baseline. Let Y
and S
denote the continuous primary outcome and continuous surrogate marker, respectively, where S
is measured post-baseline and is assumed to be a biomarker, clinical measurement, psychological test score, or other physiological measurement. In the absence of measurement error, the observed data consists of \{Y_i, S_i, G_i\}
for i \in \{1,...,n\}
. With measurement error, instead of observing S
we observe W = S + U
, where E(U|S) = 0
and the variance of U
is \sigma_u^2
. Such measurement error may be attributable to, for example, laboratory error. Thus, our observed data will consist of \{Y_i, W_i, G_i\}
for i \in \{1,...,n\}
. Throughout, we assume that \sigma_u^2
is known. Here, we are interested in estimating the proportion of the treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker, denoted as R_S
.
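As a rough illustration of the measurement error model above, the following sketch simulates a hypothetical true surrogate S and its error-prone version W = S + U. The sample size, the distribution of S, and the value \sigma_u^2 = 0.5 are illustrative assumptions, not values taken from the package.

```r
set.seed(1)
n <- 5000
sigma_u2 <- 0.5                               # assumed known error variance
S <- rnorm(n, mean = 2, sd = 1)               # hypothetical true surrogate
U <- rnorm(n, mean = 0, sd = sqrt(sigma_u2))  # error drawn independently of S
W <- S + U                                    # what is actually observed

# The variance of W is inflated by roughly sigma_u2 relative to S
c(var_S = var(S), var_W = var(W), var_S_plus_sigma_u2 = var(S) + sigma_u2)
```

The inflated variance of W relative to S is what attenuates the regression coefficient of the surrogate in the adjusted model.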
To estimate R_S
parametrically, we assume the following models E(Y|G) = \beta_0 + \beta_1 G
and E(Y|G,S) = \beta_0^* + \beta_1^*G + \beta_2^* S
. It can be shown that if these models hold, R_S=1-\beta_1^*/\beta_1
. When W = S+U
is available instead of S
, this measurement error does not affect estimation of \beta_1
, but it does affect estimation of \beta_1^*
and \beta_2^*
. Since estimation of R_S
relies on estimation of \beta_1
and \beta_1^*
, we focus on the effect of measurement error on \beta_1^*
estimation. The attenuation bias for \hat \beta_1^*
and \hat R
can be written out in closed form when the proportion of treatment effect is parametrically estimated as described above, when these specified models hold, and when the surrogate marker S
is measured with error. There exist two methods to eliminate this bias when estimating R_S
. Taking advantage of the fact that we can express the attenuation bias in closed form, the first is a straightforward disattenuated estimator: \hat \beta _{1A} = \hat{\beta}_1^* - \frac{ \hat{\beta}_2^* \{\Omega^2_{W} \Omega_{GW}-\Omega_{GW}(\Omega^2_{W} - \sigma_u^2)\}}{\Omega^2_{G}(\Omega^2_{W} - \sigma_u^2)-\Omega_{GW}\Omega_{GW}}
and \hat{R}_{A} = 1- \left [ \hat{\beta}_1^* - \frac{ \hat{\beta}_2^* \{\Omega^2_{W} \Omega_{GW}-\Omega_{GW}(\Omega^2_{W} - \sigma_u^2)\}}{\Omega^2_{G}(\Omega^2_{W} - \sigma_u^2)-\Omega_{GW}\Omega_{GW}} \right] / \hat{\beta}_1
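where \Omega^2_{G}, \Omega^2_{W}
and \Omega_{GW}
denote the sample variance of G, the sample variance of W, and the sample covariance of G and W, respectively.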
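The disattenuated estimator above can be sketched on simulated data as follows. This is a minimal illustration that assumes the two linear models hold; the data-generating values, and the names b1_star, b2_star and correction, are hypothetical and are not the package's internal code.

```r
set.seed(2)
n <- 5000
sigma_u2 <- 0.5                          # assumed known error variance
G <- rbinom(n, 1, 0.5)                   # randomized treatment indicator
S <- 1 + 2 * G + rnorm(n)                # hypothetical true surrogate
Y <- 1 + 0.5 * G + 1.5 * S + rnorm(n)    # implies beta_1 = 3.5, beta_1^* = 0.5
W <- S + rnorm(n, sd = sqrt(sigma_u2))   # observed, error-prone surrogate

beta1 <- unname(coef(lm(Y ~ G))["G"])    # not affected by measurement error
naive <- coef(lm(Y ~ G + W))             # adjusted model fitted with W, not S
b1_star <- unname(naive["G"])
b2_star <- unname(naive["W"])

# Sample variances and covariance appearing in the closed-form correction
om_G2 <- var(G); om_W2 <- var(W); om_GW <- cov(G, W)
correction <- b2_star * (om_W2 * om_GW - om_GW * (om_W2 - sigma_u2)) /
  (om_G2 * (om_W2 - sigma_u2) - om_GW * om_GW)
beta1_A <- b1_star - correction

R_naive <- 1 - b1_star / beta1
R_A <- 1 - beta1_A / beta1
round(c(truth = 1 - 0.5 / 3.5, naive = R_naive, disattenuated = R_A), 3)
```

In this simulated setting the naive estimate understates the proportion explained, and the correction moves it back toward the true value.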
The second method to eliminate this bias uses Simulation Extrapolation (SIMEX) estimation, which is a simulation-based method that involves first generating additional measurement error and observing how it affects the bias of the parameter estimate of interest, and then extrapolating this information to a setting with no measurement error. To incorporate SIMEX estimation within our surrogate marker framework, we define W_{b,i}(\lambda) = W_i + \lambda^{1/2} \sigma_u \epsilon_{i,b}
for b=1,...,B
where B=50
, \epsilon_{i,b} \sim N(0,1)
, \sigma_u
is assumed known, and \lambda \in \{0,0.25,0.5,0.75,1.0,1.25,1.5,1.75,2.0\}
. For each iteration b
and \lambda
value, we obtain \hat \beta_{1b}^*(\lambda)
by fitting the regression model: E(Y \mid G, W_b(\lambda)) = \beta_{0b}^* + \beta_{1b}^* G + \beta_{2b}^* W_{b}(\lambda).
We then calculate the average estimate for each quantity over the iterations b=1,...,B
for each \lambda
value, denoted as \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)} = B^{-1}\sum_{b=1}^B \hat \beta_{1b}^*(\lambda)
. The second step, extrapolation, takes these average estimates for each \lambda
value and extrapolates using a function G(\Gamma, \lambda)
to obtain the estimated quantity if \lambda=-1
. For the extrapolation step, we consider both a quadratic and a nonlinear extrapolation function, i.e., we solve for \Gamma = (\alpha_0, \alpha_1, \alpha_2)^T
in \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)} = \alpha_0 + \alpha_1 \lambda + \alpha_2 \lambda^2
and \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)}= \alpha_0 + \alpha_1 /( \alpha_2 + \lambda)
, respectively. Using the estimates of \alpha_0, \alpha_1, \alpha_2
, we calculate the predicted \hat \beta^*_{1,S,\sigma^2_u(1+\lambda)}
when \lambda = -1
. In essence, the simulations add successively larger amounts of measurement error, giving a total error variance of (1+\lambda)\sigma^2_u
and then extrapolate to the case when \lambda = -1
such that the measurement error is 0. We denote the resulting estimator of \beta_1^*
as \hat{\beta}^*_{1,SIMEX} = G(\hat \Gamma, -1)
and define \hat{R}_{SIMEX} = 1- \hat{\beta}^*_{1,SIMEX}/ \hat \beta_1.
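The two SIMEX steps can be sketched as follows for the parametric setting with quadratic extrapolation. This is an illustrative sketch on simulated data, not the package's implementation; the data-generating values and the names avg_b1, quad and b1_simex are assumptions. Note that a quadratic extrapolant is only an approximation to the true bias curve, so it typically reduces rather than fully removes the bias.

```r
set.seed(3)
n <- 2000
sigma_u2 <- 0.5                          # assumed known error variance
G <- rbinom(n, 1, 0.5)
S <- 1 + 2 * G + rnorm(n)                # hypothetical true surrogate
Y <- 1 + 0.5 * G + 1.5 * S + rnorm(n)
W <- S + rnorm(n, sd = sqrt(sigma_u2))   # observed, error-prone surrogate

lambdas <- seq(0, 2, by = 0.25)
B <- 50

# Simulation step: add extra error with variance lambda * sigma_u2, refit the
# adjusted model, and average the treatment coefficient over the B replicates
avg_b1 <- sapply(lambdas, function(lam) {
  mean(replicate(B, {
    W_b <- W + sqrt(lam) * sqrt(sigma_u2) * rnorm(n)
    coef(lm(Y ~ G + W_b))[["G"]]
  }))
})

# Extrapolation step: fit alpha0 + alpha1 * lambda + alpha2 * lambda^2 and
# evaluate the fitted curve at lambda = -1
quad <- lm(avg_b1 ~ lambdas + I(lambdas^2))
b1_simex <- sum(coef(quad) * c(1, -1, 1))

beta1 <- coef(lm(Y ~ G))[["G"]]
R_naive <- 1 - avg_b1[1] / beta1         # lambda = 0 reproduces the naive fit
R_simex <- 1 - b1_simex / beta1
round(c(naive = R_naive, simex_quadratic = R_simex), 3)
```

The nonlinear extrapolant \alpha_0 + \alpha_1/(\alpha_2 + \lambda) could be fitted to the same averaged estimates with a nonlinear least squares routine such as nls, which is where the convergence issues mentioned for small samples can arise.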
While the parametric approach to estimate the proportion of treatment effect explained by S
is most commonly used in clinical practice, previous work has demonstrated biased results when the assumed models are not correctly specified. An alternative approach involves estimating the treatment effect, \Delta
, and residual treatment effect, \Delta_S
, as R_S
is defined as 1-\Delta_S/\Delta
. The quantity \Delta
can be estimated simply by \hat{\Delta} = n_1^{-1}\sum_{i=1}^{n} Y_i I(G_i = 1) - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0)
, where n_1
and n_0
denote the number of individuals in the treatment and control groups, respectively. The quantity \Delta_S
can be estimated nonparametrically using kernel smoothing as \hat{\Delta}_S = n_0^{-1} \sum_{i: G_i = 0}\hat{\mu}_1(S_i) - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0)
where \hat{\mu}_1(s) = \{ \sum_{j: G_j = 1} K_h(S_j - s)Y_j \}/ \{\sum_{j:G_j = 1} K_h(S_j - s)\}
, K(\cdot)
is a smooth symmetric density function with finite support, K_h(\cdot)=K(\cdot/h)/h
and h
is a specified bandwidth such that h=O(n_1^{-\nu})
with \nu \in (1/4,1/2).
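A minimal sketch of the nonparametric estimator, assuming the surrogate is measured without error, is given below. For simplicity it uses a Gaussian kernel rather than a finite-support kernel, and the bandwidth h = sd(s1) * n1^(-1/3) is one illustrative choice satisfying the stated rate; both are assumptions and not necessarily what the package uses.

```r
set.seed(4)
n1 <- 2000; n0 <- 2000
s1 <- rnorm(n1, mean = 3);   y1 <- 1.5 + 1.5 * s1 + rnorm(n1)   # treated
s0 <- rnorm(n0, mean = 2.5); y0 <- 1.0 + 1.5 * s0 + rnorm(n0)   # control
# Under this hypothetical model Delta = 1.25, Delta_S = 0.5 and R_S = 0.6

h <- sd(s1) * n1^(-1/3)                  # bandwidth with nu = 1/3

# Kernel-smoothed estimate of E(Y | S = s) in the treated group
mu1_hat <- function(s) {
  k <- dnorm((s1 - s) / h) / h
  sum(k * y1) / sum(k)
}

delta_hat   <- mean(y1) - mean(y0)
delta_s_hat <- mean(sapply(s0, mu1_hat)) - mean(y0)
R_hat <- 1 - delta_s_hat / delta_hat
round(c(delta = delta_hat, delta_s = delta_s_hat, R = R_hat), 3)
```

With a finite-support kernel, control observations whose surrogate values lie outside the range of the treated group would give a zero denominator, which is the situation the extrapolate argument is meant to address.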
When W = S + U
is available instead of S
, estimation of \Delta
is not affected whereas estimation of \Delta_S
is affected and thus, the nonparametric estimation procedure described above results in a biased estimate of R_S
. Unlike the parametric approach, the attenuation bias cannot be expressed in closed form. Within this nonparametric framework, SIMEX estimation can be used to correct for measurement error. We implement the estimation procedure as described above where we first generate additional measurement error to obtain W_{b,i}(\lambda)
and for each iteration b
and \lambda
value, obtain \hat{\Delta}_{S,b}(\lambda) = n_0^{-1} \sum_{i: G_i = 0} \left \{ \frac{\sum_{j: G_j = 1} K_h(W_{b,j}(\lambda) - W_{b,i}(\lambda))Y_j}{\sum_{j:G_j = 1} K_h(W_{b,j}(\lambda)- W_{b,i}(\lambda))} \right \} - n_0^{-1}\sum_{i=1}^{n} Y_i I(G_i = 0).
We then calculate the average estimate for each quantity over the iterations b=1,...,B
for each \lambda
value, denoted as \hat{\Delta}_{S,\sigma_u^2(1+\lambda)} = B^{-1}\sum_{b=1}^B \hat{\Delta}_{S,b}(\lambda)
and extrapolate using a function G(\Gamma, \lambda)
; we specifically use the quadratic and nonlinear functions as in the parametric setting. We denote the resulting estimator of \Delta_S
as \hat{\Delta}_{S,SIMEX} = G(\hat \Gamma, -1)
and define \hat{R}_{S,SIMEX} = 1- \hat{\Delta}_{S,SIMEX} / \hat \Delta.
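The nonparametric SIMEX procedure can be sketched in the same way. The example below is illustrative only: it uses a Gaussian kernel, a bandwidth recomputed for each simulated data set, quadratic extrapolation, and a smaller B than the B = 50 described above to keep the run time short. The helper delta_s_fun and all data-generating values are hypothetical.

```r
set.seed(5)
n1 <- 400; n0 <- 400
sigma_u2 <- 0.5                                   # assumed known error variance
s1 <- rnorm(n1, mean = 3);   y1 <- 1.5 + 1.5 * s1 + rnorm(n1)
s0 <- rnorm(n0, mean = 2.5); y0 <- 1.0 + 1.5 * s0 + rnorm(n0)
w1 <- s1 + rnorm(n1, sd = sqrt(sigma_u2))         # observed in treated group
w0 <- s0 + rnorm(n0, sd = sqrt(sigma_u2))         # observed in control group

# Kernel estimate of the residual treatment effect from surrogate values x1, x0
delta_s_fun <- function(x1, x0) {
  h <- sd(x1) * length(x1)^(-1/3)
  K <- dnorm(outer(x0, x1, "-") / h)              # n0 x n1 kernel weights
  mu1 <- as.vector(K %*% y1) / rowSums(K)
  mean(mu1) - mean(y0)
}

lambdas <- seq(0, 2, by = 0.25)
B <- 20                                           # reduced for illustration

# Simulation step: add extra error to W in both groups and average over B
avg_ds <- sapply(lambdas, function(lam) {
  mean(replicate(B, {
    wb1 <- w1 + sqrt(lam * sigma_u2) * rnorm(n1)
    wb0 <- w0 + sqrt(lam * sigma_u2) * rnorm(n0)
    delta_s_fun(wb1, wb0)
  }))
})

# Extrapolation step: quadratic fit evaluated at lambda = -1
quad <- lm(avg_ds ~ lambdas + I(lambdas^2))
delta_s_simex <- sum(coef(quad) * c(1, -1, 1))
delta_hat <- mean(y1) - mean(y0)
R_simex <- 1 - delta_s_simex / delta_hat
round(c(naive = 1 - avg_ds[1] / delta_hat, simex_quadratic = R_simex), 3)
```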
In this function, parametric estimation is equivalent to Freedman's approach in the R.s.estimate documentation; nonparametric estimation is equivalent to the robust approach in the R.s.estimate documentation. Variance estimates for all estimators are calculated in this function based on derived closed form variance expressions. For all approaches, confidence intervals for \Delta_S
can be constructed using a normal approximation; confidence intervals for R_S
can be constructed using either a normal approximation or using Fieller's method, all of which are provided in this function. Details regarding the asymptotic properties of these estimators and closed form variance calculation can be found in: Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
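To illustrate the difference between the two interval types, the sketch below computes a normal-approximation interval and a Fieller interval for R = 1 - \Delta_S/\Delta from given estimates. The point estimates, variances and covariance are hypothetical numbers chosen for illustration, and the delta-method variance used here is a generic textbook formula rather than the closed-form expressions derived in the reference.

```r
# Hypothetical estimates and their (co)variances
delta_hat   <- 1.25;  v_d   <- 0.010    # treatment effect and its variance
delta_s_hat <- 0.50;  v_ds  <- 0.006    # residual effect and its variance
v_dds <- 0.004                          # covariance between the two estimates
z <- qnorm(0.975)

theta <- delta_s_hat / delta_hat
R_hat <- 1 - theta

# Normal approximation with a delta-method variance for the ratio
var_R <- (v_ds - 2 * theta * v_dds + theta^2 * v_d) / delta_hat^2
ci_normal <- R_hat + c(-1, 1) * z * sqrt(var_R)

# Fieller: all theta with (delta_s - theta * delta)^2 <= z^2 * Var(delta_s - theta * delta),
# i.e. the roots of qa * theta^2 + qb * theta + qc = 0
qa <- delta_hat^2 - z^2 * v_d
qb <- -2 * (delta_s_hat * delta_hat - z^2 * v_dds)
qc <- delta_s_hat^2 - z^2 * v_ds
disc <- qb^2 - 4 * qa * qc
stopifnot(qa > 0, disc > 0)             # otherwise the Fieller set is unbounded
roots <- (-qb + c(-1, 1) * sqrt(disc)) / (2 * qa)
ci_fieller <- rev(1 - roots)            # map the interval for theta to R

rbind(normal = ci_normal, fieller = ci_fieller)
```

The two intervals are close when \hat{\Delta} is large relative to its standard error; Fieller's interval becomes noticeably asymmetric, or unbounded, as the denominator approaches zero.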
Value
A list is returned:
R.naive |
the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.var |
the estimated variance of the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
R.naive.CI.fieller |
the 95% confidence interval using Fieller's approach for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE |
B1star.naive |
the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
B1star.naive.var |
the estimated variance of the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
B1star.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE |
deltas.naive |
the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
deltas.naive.var |
the estimated variance of the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
deltas.naive.CI.normal |
the 95% confidence interval using the normal approximation for the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE |
R.corrected.dis |
the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
R.corrected.var.dis |
the estimated variance of the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator = "d" |
R.corrected.CI.normal.dis |
the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
R.corrected.CI.fieller.dis |
the 95% confidence interval using Fieller's approach for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d" |
B1star.corrected.dis |
the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
B1star.corrected.var.dis |
the estimated variance of the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
B1star.corrected.CI.normal.dis |
the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE |
R.corrected.q |
the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
R.corrected.CI.fieller.q |
the 95% confidence interval using Fieller's approach for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q" |
B1star.corrected.q |
the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE |
deltas.corrected.q |
the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.var.q |
the estimated variance of the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.CI.normal.q |
the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE |
R.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n" |
R.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n" |
R.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n" |
R.corrected.CI.fieller.nl |
the 95% confidence interval using Fieller's approach for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n" |
B1star.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE |
B1star.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE |
deltas.corrected.nl |
the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.var.nl |
the estimated variance of the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE |
deltas.corrected.CI.normal.nl |
the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE |
Author(s)
Layla Parast
References
Parast, L., Garcia, T. P., Prentice, R. L., & Carroll, R. J. (2022). Robust methods to correct for measurement error when evaluating a surrogate marker. Biometrics, 78(1), 9-23.
Examples
data(d_example_me)
names(d_example_me)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = 0.5,
naive= TRUE, Ronly = FALSE)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "q", me.variance = 0.5,
naive= FALSE, Ronly = TRUE)
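# estimating the measurement error variance from replicate measurements
# (rows are individuals, columns are replicates; treated stacked above control)
replicates = rbind(cbind(d_example_me$s1_rep1, d_example_me$s1_rep2,
d_example_me$s1_rep3), cbind(d_example_me$s0_rep1, d_example_me$s0_rep2,
d_example_me$s0_rep3))
# per-individual mean and number of non-missing replicates
mean.i = apply(replicates,1,mean, na.rm = TRUE)
num.i = apply(replicates,1,function(x) sum(!is.na(x)))
# pooled within-individual squared deviations divided by the number of replicates
var.u = sum((replicates-mean.i)^2, na.rm = TRUE)/sum(num.i)
var.u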
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1,
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = var.u,
naive= TRUE, Ronly = FALSE)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0,
sone=d_example_me$s1, szero=d_example_me$s0, parametric = FALSE, estimator = "q",
me.variance = 0.5, naive= FALSE, Ronly = TRUE)