speff {speff2trial} | R Documentation |
Semiparametric efficient estimation and testing for a two-sample treatment effect with a quantitative or dichotomous endpoint
Description
speff conducts estimation and testing of the treatment effect in a 2-group randomized
clinical trial with a quantitative or dichotomous endpoint. The method is a special case of Robins, Rotnitzky, and
Zhao (1994, JASA). It improves efficiency by leveraging baseline predictors of the endpoint. The method uses
inverse probability weighting to provide unbiased estimation when the endpoint is missing at random.
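
The idea of inverse probability weighting under missingness at random can be illustrated with a small simulation. This is only a sketch on made-up data, not the code used inside speff; the variable names (x, y, r, p_hat) and the simulated model are assumptions chosen for illustration.

```r
# Illustrative simulation only; not code from speff2trial.
set.seed(1)
n <- 20000
x <- rnorm(n)                      # baseline predictor
y <- 1 + 2 * x + rnorm(n)          # endpoint; its true mean is 1
p_obs <- plogis(0.5 + 1.5 * x)     # P(endpoint observed | x): missing at random
r <- rbinom(n, 1, p_obs)           # 1 = endpoint observed, 0 = missing
y_obs <- ifelse(r == 1, y, NA)

# The complete-case mean ignores the missingness mechanism and is biased here.
cc_mean <- mean(y_obs, na.rm = TRUE)

# Estimate the observation probabilities, then weight each observed
# endpoint by the inverse of its estimated probability of being observed.
p_hat <- fitted(glm(r ~ x, family = binomial))
obs <- r == 1
ipw_mean <- sum(y_obs[obs] / p_hat[obs]) / sum(1 / p_hat[obs])

c(complete_case = cc_mean, ipw = ipw_mean)
```

In this setup subjects with larger x are both more likely to be observed and tend to have larger y, so the complete-case mean overshoots while the weighted mean stays close to the true value.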
Usage
speff(formula, endpoint=c("quantitative", "dichotomous"), data,
postrandom=NULL, force.in=NULL, nvmax=9,
method=c("exhaustive", "forward", "backward"),
optimal=c("cp", "bic", "rsq"), trt.id, conf.level=0.95,
missCtrl=NULL, missTreat=NULL, endCtrlPre=NULL,
endTreatPre=NULL, endCtrlPost=NULL, endTreatPost=NULL)
Arguments
formula |
a formula object with the response on the left of the ~ operator and the covariates on the right. |
endpoint |
a character string specifying the type of the response variable, either "quantitative" (the default) or "dichotomous". |
data |
a data frame in which to interpret the variables named in the formula. |
postrandom |
a character vector designating the postrandomization covariates included in the formula (this argument distinguishes baseline from postrandomization covariates). |
force.in |
a vector of indices to columns of the design matrix that should be included in each regression model. |
nvmax |
the maximum number of covariates considered for inclusion in a model. The default is 9. |
method |
specifies the type of search technique used in the model selection procedure carried out by the regsubsets function. |
optimal |
specifies the optimization criterion for model selection. The default is "cp"; the other options are "bic" and "rsq". |
trt.id |
a character string specifying the name of the treatment indicator, which can be a character or a numeric vector. The control and treatment groups are defined by the alphanumeric order of the labels used in the treatment indicator. |
conf.level |
the confidence level to be used for confidence intervals reported by summary.speff. The default is 0.95. |
missCtrl |
estimated probabilities of observing the endpoint based on pre- and postrandomization information in the control group |
missTreat |
estimated probabilities of observing the endpoint based on pre- and postrandomization information in the treatment group |
endCtrlPre |
predicted values of the endpoint using baseline information in the control group only |
endTreatPre |
predicted values of the endpoint using baseline information in the treatment group only |
endCtrlPost |
predicted values of the endpoint using baseline and postrandomization information in the control group |
endTreatPost |
predicted values of the endpoint using baseline and postrandomization information in the treatment group |
Details
The treatment effect is represented by the mean difference or the log odds ratio for a quantitative or dichotomous endpoint, respectively. Estimates of the treatment effect that ignore baseline covariates (naive) are included in the output.
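
As a rough illustration of the two effect measures, the unadjusted (naive) versions can be computed directly from the group summaries. The sketch below uses simulated data and hypothetical variable names (trt, y_quant, y_bin); it is not taken from the package, and the naive estimates reported by speff are only presumed to correspond to these quantities.

```r
# Illustrative simulation only; not code from speff2trial.
set.seed(2)
n <- 400
trt <- rep(0:1, each = n / 2)      # 0 = control, 1 = treatment

# Quantitative endpoint: the effect is the difference in group means.
y_quant <- 300 + 40 * trt + rnorm(n, sd = 100)
naive_diff <- mean(y_quant[trt == 1]) - mean(y_quant[trt == 0])

# Dichotomous endpoint: the effect is the log odds ratio.
y_bin <- rbinom(n, 1, plogis(-0.3 + 0.6 * trt))
p1 <- mean(y_bin[trt == 1])
p0 <- mean(y_bin[trt == 0])
naive_log_or <- qlogis(p1) - qlogis(p0)   # log(p1/(1-p1)) - log(p0/(1-p0))

# The same numbers are the slope of a regression on treatment alone.
lm_slope  <- unname(coef(lm(y_quant ~ trt))["trt"])
glm_slope <- unname(coef(glm(y_bin ~ trt, family = binomial))["trt"])

c(naive_diff = naive_diff, naive_log_or = naive_log_or)
```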
Using the automated model selection procedure performed by regsubsets, four optimal regression models are
developed for the study endpoint. Initially, all baseline and postrandomization covariates specified in the formula
are considered for inclusion by the model selection procedure carried out separately in each treatment group. The
optimal models are used to construct predicted values of the endpoint. Subsequently, in each treatment group, another
regression model is fitted that includes only baseline covariates that were selected in the previous optimization.
Then predicted values of the endpoint are computed based on these models. If missingness occurs in the endpoint
variable, the model selection procedure is additionally used to determine the optimal models for predicting
whether a subject has an observed endpoint, separately in each treatment group.
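
The two-stage modeling within each treatment group can be sketched as follows. This is a simplified illustration on simulated data: base R's step (backward selection by AIC) stands in for the regsubsets search, and the data frame dat, the covariates x1, x2, z1, and the helper fit_arm are all hypothetical names, not part of the package.

```r
# Illustrative simulation only; step() is a stand-in for regsubsets.
set.seed(4)
n <- 400
dat <- data.frame(
  trt = rep(0:1, each = n / 2),
  x1 = rnorm(n), x2 = rnorm(n),    # baseline covariates
  z1 = rnorm(n)                    # postrandomization covariate
)
dat$y <- 1 + 0.5 * dat$trt + 2 * dat$x1 + 1.5 * dat$z1 + rnorm(n)
baseline <- c("x1", "x2")

fit_arm <- function(d) {
  # Stage 1: select among all baseline and postrandomization covariates.
  full <- lm(y ~ x1 + x2 + z1, data = d)
  sel <- step(full, trace = 0)
  selected <- attr(terms(sel), "term.labels")

  # Stage 2: refit using only the baseline covariates kept in stage 1.
  keep <- intersect(selected, baseline)
  rhs <- if (length(keep) > 0) keep else "1"
  pre <- lm(reformulate(rhs, response = "y"), data = d)

  list(selected = selected, baseline_kept = keep,
       pred_post = fitted(sel),    # uses baseline and postrandomization data
       pred_pre = fitted(pre))     # uses baseline data only
}

# One pair of models per treatment group, four models in total.
fits <- lapply(split(dat, dat$trt), fit_arm)
lapply(fits, function(f) f[c("selected", "baseline_kept")])
```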
The function regsubsets conducts optimization of linear regression models only. The following modification
in the model selection is adopted for a dichotomous variable: initially, a logistic regression model is fitted
with all baseline and postrandomization covariates included in the formula. Subsequently, an optimal model is
selected by using a weighted linear regression with weights from the last iteration of the IWLS algorithm.
The optimal model is then refitted by logistic regression.
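
The link between the logistic fit and a weighted linear regression can be checked in base R. The sketch below is an illustration on simulated data with hypothetical covariates x1, x2, x3; it assumes the weighted regression uses the working weights and working response of the converged glm fit, which may differ in detail from what speff does internally. The subset search itself is omitted and the final subset is picked by hand.

```r
# Illustrative simulation only; not code from speff2trial.
set.seed(3)
n <- 500
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x1 - 0.5 * x2))

# Step 1: logistic regression with all candidate covariates.
full <- glm(y ~ x1 + x2 + x3, family = binomial)

# Working weights and working response from the last IWLS iteration.
w <- weights(full, type = "working")
z <- full$linear.predictors + residuals(full, type = "working")

# Step 2: the weighted linear regression on which a linear-model
# subset search could operate. With all covariates included, it
# reproduces the logistic coefficients up to convergence tolerance.
wls <- lm(z ~ x1 + x2 + x3, weights = w)
cbind(logistic = coef(full), weighted_linear = coef(wls))

# Step 3: refit the chosen subset by logistic regression
# (here x1 and x2 are chosen by hand for illustration).
refit <- glm(y ~ x1 + x2, family = binomial)
coef(refit)
```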
Besides using the built-in model selection algorithms, the user has the option to explicitly enter predicted values of the endpoint as well as estimated probabilities of observing the endpoint if it is missing at random.
Value
speff returns an object of class "speff" which can be processed by summary.speff to obtain or print a summary of the results. An object of class "speff" is a list containing the following components:
coef |
a matrix with estimates of treatment-specific mean responses and the treatment effect. |
cov |
a list with components |
varbeta |
a numeric vector of variance estimates of the naive and semiparametric treatment effect estimates. |
formula |
a list with components |
rsq |
a numeric vector of the R-squared statistics for the optimal selected regression models predicting
the study endpoint. Set to |
endpoint |
" |
postrandom |
a character vector of postrandomization covariates considered for selection. |
predicted |
a logical vector; if |
conf.level |
confidence level of the confidence intervals reported by summary.speff. |
method |
search technique employed in the model selection procedure. |
n |
number of subjects in each treatment group. |
References
Robins JM, Rotnitzky A, Zhao LP. (1994), "Estimation of regression coefficients when some regressors are not always observed", Journal of the American Statistical Association, 89:846–866.
Tsiatis AA, Davidian M, Zhang M, Lu X. (2007), "Covariate adjustment for two-sample treatment comparisons in randomized clinical trials: A principled yet flexible approach", Statistics in Medicine, 27:4658–4677.
Zhang M, Tsiatis AA, Davidian M. (2008), "Improving efficiency of inferences in randomized clinical trials using auxiliary covariates", Biometrics, 64:707–715.
Davidian M, Tsiatis AA, Leon S. (2005), "Semiparametric estimation of treatment effect in a pretest-posttest study with missing data", Statistical Science, 20:261–301.
Zhang M, Gilbert P. (2009), "Increasing the efficiency of prevention trials by incorporating baseline covariates", manuscript.
See Also
Examples
### load the package that provides speff and the ACTG175 data
library(speff2trial)
str(ACTG175)
### treatment effect estimation with a quantitative endpoint missing
### at random
fit1 <- speff(cd496 ~ age+wtkg+hemo+homo+drugs+karnof+oprior+preanti+
race+gender+str2+strat+symptom+cd40+cd420+cd80+cd820+offtrt,
postrandom=c("cd420","cd820","offtrt"), data=ACTG175, trt.id="treat")
### 'fit2' adds quadratic effects of CD420 and CD820 and their
### two-way interaction
fit2 <- speff(cd496 ~ age+wtkg+hemo+homo+drugs+karnof+oprior+preanti+
race+gender+str2+strat+symptom+cd40+cd420+I(cd420^2)+cd80+cd820+
I(cd820^2)+cd420:cd820+offtrt, postrandom=c("cd420","I(cd420^2)",
"cd820","I(cd820^2)","cd420:cd820","offtrt"), data=ACTG175,
trt.id="treat")
### 'fit3' uses R-squared as the optimization criterion
fit3 <- speff(cd496 ~ age+wtkg+hemo+homo+drugs+karnof+oprior+preanti+
race+gender+str2+strat+symptom+cd40+cd420+cd80+cd820+offtrt,
postrandom=c("cd420","cd820","offtrt"), data=ACTG175, trt.id="treat",
optimal="rsq")
### a dichotomous response is created with missing values maintained
ACTG175$cd496bin <- ifelse(ACTG175$cd496 > 250, 1, 0)
### treatment effect estimation with a dichotomous endpoint missing
### at random
fit4 <- speff(cd496bin ~ age+wtkg+hemo+homo+drugs+karnof+oprior+preanti+
race+gender+str2+strat+symptom+cd40+cd420+cd80+cd820+offtrt,
postrandom=c("cd420","cd820","offtrt"), data=ACTG175, trt.id="treat",
endpoint="dichotomous")