geeDREstimation {CRTgeeDR}R Documentation

Doubly Robust Inverse Probability Weighted Augmented GEE Estimator

Description

This function implements a GEE estimator. It implements classical GEE, IPW-GEE, augmented GEE and IPW-Augmented GEE (Doubly robust).

Usage

geeDREstimation(formula, id, data = parent.frame(), family = gaussian,
  corstr = "independence", Mv = 1, weights = NULL, aug = NULL,
  pi.a = 1/2, corr.mat = NULL, init.beta = NULL, init.alpha = NULL,
  init.phi = 1, scale.fix = FALSE, sandwich = TRUE, maxit = 20,
  tol = 1e-05, print.log = FALSE, typeweights = "VW", nameTRT = "TRT",
  model.weights = NULL, model.augmentation.trt = NULL,
  model.augmentation.ctrl = NULL, stepwise.augmentation = FALSE,
  stepwise.weights = FALSE, nameMISS = "MISSING", nameY = "OUTCOME",
  sandwich.nuisance = FALSE, fay.adjustment = FALSE, fay.bound = 0.75)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

id

a vector which identifies the clusters. The length of "id" should be the same as the number of observations. Data are assumed to be sorted so that observations on a cluster are contiguous rows for all entities in the formula.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which CRTgeeDR is called.

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)

corstr

a character string specifying the correlation structure. The following are permitted: '"independence"', '"exchangeable"', '"ar1"', '"unstructured"' and '"userdefined"'

Mv

for "m-dependent", the value for m

weights

A vector of weights for each observation. If an observation has weight 0, it is excluded from the calculations of any parameters. Observations with a NA anywhere (even in variables not included in the model) will be assigned a weight of 0.

aug

A list of vector (one for A=1 treated, one for A=0 control) for each observation representing E(Y|X,A=a).

pi.a

A number, Probability of treatment attribution P(A=1)

corr.mat

The correlation matrix for "fixed". Matrix should be symmetric with dimensions >= the maximum cluster size. If the correlation structure is "userdefined", then this is a matrix describing which correlations are the same.

init.beta

an optional vector with the initial values of beta. If not specified, then the intercept will be set to InvLink(mean(response)). init.beta must be specified if not using an intercept.

init.alpha

an optional scalar or vector giving the initial values for the correlation. If provided along with Mv>1 or unstructured correlation, then the user must ensure that the vector is of the appropriate length.

init.phi

an optional initial overdispersion parameter. If not supplied, initialized to 1.

scale.fix

if set to TRUE, then the scale parameter is fixed at the value of init.phi.

sandwich

if set to TRUE, the sandwich variance is provided together with the naive estimator of variance.

maxit

maximum number of iterations.

tol

tolerance in calculation of coefficients.

print.log

if set to TRUE, a report is printed.

typeweights

a character string specifying the weights implementation. The following are permitted: "GENMOD" for W^{1/2}V^{-1}W^{1/2}, "WV" for V^{-1}W

nameTRT

Name of the variable containing information for the treatment

model.weights

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted for the propensity score. Must model the probability of being observed.

model.augmentation.trt

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted for the ouctome model for the treated group (A=1).

model.augmentation.ctrl

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted for the ouctome model for the control group (A=0).

stepwise.augmentation

if set to TRUE, a stepwise for the augmentation model is performed during the fit of the augmentation model for the OM

stepwise.weights

if set to TRUE, a stepwise for the propensity score is performed during the fit of the augmentation model for the OM

nameMISS

Name of the variable containing information for the Missing indicator

nameY

Name of the variable containing information for the outcome

sandwich.nuisance

if set to TRUE, the nuisance adjusted sandwich variance is provided.

fay.adjustment

if set to TRUE, the small-sample nuisance adjusted sandwich variance with Fay's adjustement is provided.

fay.bound

if set to 0.75 by default, bound value used for Fay's adjustement.

Details

The estimator is founds by solving:

0= \sum_{i=1}^M \Bigg[ \boldsymbol D_i^T \boldsymbol V_i^{-1} \boldsymbol W_i(\boldsymbol X_i, A_i, \boldsymbol \eta_W) \left( \boldsymbol Y_i - \boldsymbol B(\boldsymbol X_i, A_i, \boldsymbol \eta_B) \right)

\qquad + \sum_{a=0,1} p^a(1-p)^{1-a} \boldsymbol D_i^T \boldsymbol V_i^{-1} \Big( \boldsymbol B(\boldsymbol X_i,A_i=a, \boldsymbol \eta_B) -\boldsymbol \mu_i(\boldsymbol \beta,A_i=a)\Big) \Bigg]

where \boldsymbol D_i=\frac{\partial \boldsymbol \mu_i(\boldsymbol \beta,A_i)}{\partial \boldsymbol \beta^T} is the design matrix, \boldsymbol V_i is the covariance matrix equal to \boldsymbol U_i^{1/2} \boldsymbol C(\boldsymbol \alpha)\boldsymbol U_i^{1/2} with \boldsymbol U_i a diagonal matrix with elements {\rm var}(y_{ij}) and \boldsymbol C(\boldsymbol \alpha) is the working correlation structure with non-diagonal terms \boldsymbol \alpha. Parameters \boldsymbol \alpha are estimated using simple moment estimators from the Pearson residuals. The matrix of weights \boldsymbol W_i(\boldsymbol X_i, A_i, \boldsymbol \eta_W)=diag\left[R_{ij}/\pi_{ij}(\boldsymbol X_i, A_i, \boldsymbol \eta_W)\right]_{j=1,\dots,n_{i}}, where \pi_{ij}(\boldsymbol X_i, A_i, \boldsymbol \eta_W)=P(R_{ij}|\boldsymbol X_i, A_i) is the Propensity score (PS). The function \boldsymbol B(\boldsymbol X_i,A_i=a,\boldsymbol \eta_B), which is called the Outcome Model (OM), is a function linking Y_{ij} with \boldsymbol X_i and A_i. The \boldsymbol \eta_B are nuisance parameters that are estimated. The estimator is most efficient if the OM is equal to E(\boldsymbol Y_i|\boldsymbol X_i,A_i=a) The estimator denoted \hat{\beta}_{aug} is found by solving the estimating equation. Although analytic solutions sometimes exist, coefficient estimates are generally obtained using an iterative procedure such as the Newton-Raphson method. Automatic implementation is such that, \hat{ \boldsymbol \eta}_W in \boldsymbol W_i(\boldsymbol X_i, A_i, \hat{ \boldsymbol \eta}_W) are obtained using a logistic regression and \hat{ \boldsymbol \eta}_B in \boldsymbol B(\boldsymbol X_i,A_i,\hat{ \boldsymbol \eta}_B) are obtained using a linear regression.

The variance of \hat{\boldsymbol \beta}_{aug} is estimated by the sandwich variance estimator. There are two external sources of variability that need to be accounted for: estimation of \boldsymbol \eta_W for the PS and of \boldsymbol \eta_B for the OM. We denote \boldsymbol \Omega=(\boldsymbol \beta, \boldsymbol \eta_W,\boldsymbol \eta_B) the estimated parameters of interest and nuisance parameters. We can stack estimating functions and score functions for \boldsymbol \Omega:

\small \boldsymbol U_i(\boldsymbol \Omega)= \left( \begin{array}{c} \boldsymbol \Phi_i(\boldsymbol Y_i,\boldsymbol X_i,A_i,\boldsymbol \beta, \boldsymbol \eta_W, \boldsymbol \eta_B) \\ \boldsymbol S^W_i(\boldsymbol X_i, A_i, \boldsymbol \eta_W)\\ \boldsymbol S^B_i(\boldsymbol X_i, A_i, \boldsymbol \eta_B)\\ \end{array} \right)

where \boldsymbol S^W_i and \boldsymbol S^B_i represent the score equations for patients in cluster i for the estimation of \boldsymbol \eta_W and \boldsymbol \eta_B in the PS and the OM. A standard Taylor expansion paired with Slutzky's theorem and the central limit theorem give the sandwich estimator adjusted for nuisance parameters estimation in the OM and PS:

Var(\boldsymbol \Omega)={{E\left[\frac{\partial \boldsymbol U_i(\boldsymbol \Omega)}{\partial \boldsymbol \Omega}\right]}^{-1}}^{T} \underbrace{{E\left[ \boldsymbol U_i(\boldsymbol \Omega)\boldsymbol U_i^T(\boldsymbol \Omega) \right]}}_{\boldsymbol \Delta_{adj}} \underbrace{E\left[\frac{\partial \boldsymbol U_i(\boldsymbol \Omega)}{\partial \boldsymbol \Omega}\right]^{-1} }_{\boldsymbol \Gamma^{-1}_{adj}}.

Value

An object of type 'CRTgeeDR'

$beta Final values for regressors estimates

Author(s)

Melanie Prague [based on R packages 'geeM' L. S. McDaniel, N. C. Henderson, and P. J. Rathouz. Fast Pure R Implementation of GEE: Application of the Matrix Package. The R Journal, 5(1):181-188, June 2013.]

References

Details regarding implementation can be found in

Examples

data(data.sim)
 ## Not run: 
 #### STANDARD GEE
 geeresults<-geeDREstimation(formula=OUTCOME~TRT,
                               id="CLUSTER" , data = data.sim,
                               family = "binomial", corstr = "independence")
 summary(geeresults)
 #### IPW GEE
 ipwresults<-geeDREstimation(formula=OUTCOME~TRT,
                               id="CLUSTER" , data = data.sim,
                               family = "binomial", corstr = "independence",
                               model.weights=I(MISSING==0)~TRT*AGE)
 summary(ipwresults)
 #### AUGMENTED GEE
 augresults<-geeDREstimation(formula=OUTCOME~TRT,
                               id="CLUSTER" , data = data.sim,
                               family = "binomial", corstr = "independence",
                               model.augmentation.trt=OUTCOME~AGE,
                               model.augmentation.ctrl=OUTCOME~AGE, stepwise.augmentation=FALSE)
 summary(augresults)
 
## End(Not run)
 #### DOUBLY ROBUST
 drresults<-geeDREstimation(formula=OUTCOME~TRT,
                               id="CLUSTER" , data = data.sim,
                               family = "binomial", corstr = "independence",
                               model.weights=I(MISSING==0)~TRT*AGE,
                               model.augmentation.trt=OUTCOME~AGE,
                               model.augmentation.ctrl=OUTCOME~AGE, stepwise.augmentation=FALSE)
 summary(drresults)

[Package CRTgeeDR version 2.0.1 Index]