
get.weights {causalCmprsk}

R Documentation

Fitting a logistic regression model for propensity scores and estimating weights

Description

Fits a propensity score model by logistic regression and returns both the estimated propensity scores and the requested weights. The estimated propensity scores can be used for further diagnostics, e.g. for checking the positivity assumption and covariate balance.
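
As an illustration of the positivity diagnostic mentioned above, the following is a minimal sketch that fits a logistic propensity score model directly with `glm` and inspects the range of the fitted scores in each treatment group. The data set, the variable names (`A`, `c1`, `c2`) and the 0.05/0.95 cut-offs are assumptions made for this example only; `glm` is used as a stand-in and the code does not call `get.weights`.

```r
# Hypothetical data: a 0/1 treatment indicator A and two covariates
set.seed(1)
n <- 500
c1 <- runif(n)
c2 <- rbinom(n, 1, 0.2)
A <- rbinom(n, 1, plogis(0.5 + c1 - c2))
d <- data.frame(A = A, c1 = c1, c2 = c2)

# Logistic regression model for P(A=1|C), fitted directly with glm
ps.fit <- glm(A ~ c1 + c2, family = binomial(), data = d)
ps <- fitted(ps.fit)

# Positivity check: the scores should stay away from 0 and 1 in both groups
ps.range <- tapply(ps, d$A, range)       # min and max of the scores per group
n.extreme <- sum(ps < 0.05 | ps > 0.95)  # cut-offs are an arbitrary choice
print(ps.range)
print(n.extreme)
```

Scores that pile up near 0 or 1 in one group would suggest limited overlap between the treated and control populations.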

Usage

get.weights(formula, data, A, C = NULL, wtype = "unadj", case.w = NULL)

Arguments

`formula`	a formula expression, of the form `response ~ predictors`. The `response` is a binary treatment/exposure variable, for which a logistic regression model (a Propensity Scores model) will be fit using `glm`. See the documentation of `glm` and `formula` for details. As an alternative to specifying `formula`, arguments `A` and `C`, defined below, can be specified.
`data`	a data frame that includes a treatment indicator `A` and covariates `C` appearing in `formula`.
`A`	a character string specifying the name of the treatment/exposure variable. `A` is assumed to be a numeric binary indicator with 0/1 values, where `A`=1 denotes the treatment group and `A`=0 the control group.
`C`	a vector of character strings with the names of the variables (potential confounders) in the logistic regression model for Propensity Scores, i.e. P(A=1|C=c). The default value of `C` is NULL, which corresponds to `wtype`="unadj" and estimates treatment effects in the raw (observed) data.
`wtype`	a character string variable indicating the type of weights that will define the target population for which the ATE will be estimated. The default is "unadj" - this will not adjust for possible treatment selection bias and will not use propensity scores weighting. It can be used, for example, in data from a randomized controlled trial (RCT) where there is no need for emulation of baseline randomization. Other possible values are "stab.ATE", "ATE", "ATT", "ATC" and "overlap". See Table 1 from Li, Morgan, and Zaslavsky (2018).
`case.w`	a vector of case weights.
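
To make the `wtype` options more concrete, here is a hedged sketch of how weights of each type are commonly computed from a propensity score e = P(A=1|C=c), following the usual definitions in Li, Morgan, and Zaslavsky (2018) and, for the stabilized weights, Hernán, Brumback, and Robins (2000). The helper `ps.weights` is hypothetical and is not part of causalCmprsk; the package's own implementation and scaling may differ, and returning weights of 1 for "unadj" is an assumption.

```r
# Hypothetical helper (not part of causalCmprsk): weights from propensity
# scores e = P(A=1|C) for a 0/1 treatment indicator a.
ps.weights <- function(a, e,
                       wtype = c("unadj", "ATE", "stab.ATE", "ATT", "ATC", "overlap")) {
  wtype <- match.arg(wtype)
  p1 <- mean(a)  # marginal probability of treatment, used for stabilization
  switch(wtype,
    unadj    = rep(1, length(a)),                           # assumed: no weighting
    ATE      = ifelse(a == 1, 1 / e, 1 / (1 - e)),          # inverse probability weights
    stab.ATE = ifelse(a == 1, p1 / e, (1 - p1) / (1 - e)),  # stabilized version
    ATT      = ifelse(a == 1, 1, e / (1 - e)),              # target: the treated
    ATC      = ifelse(a == 1, (1 - e) / e, 1),              # target: the controls
    overlap  = ifelse(a == 1, 1 - e, e)                     # target: overlap population
  )
}

a <- c(1, 1, 0, 0)
e <- c(0.8, 0.5, 0.5, 0.2)
ps.weights(a, e, "ATE")      # 1.25 2.00 2.00 1.25
ps.weights(a, e, "overlap")  # 0.2 0.5 0.5 0.2
```

The overlap weights are bounded between 0 and 1, whereas the "ATE", "ATT" and "ATC" weights can become very large when a propensity score is close to 0 or 1.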

Value

A list with the following fields:

`wtype`	a character string indicating the type of the estimated weights
`ps`	a vector of estimated propensity scores P(A=a|C=c)
`w`	a vector of estimated weights
`summary.glm`	a summary of the logistic regression fit, which is done using `stats::glm`
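
A vector of weights like the one returned in `w` can be used to check covariate balance. The sketch below computes a standardized mean difference of one covariate between the treatment groups, before and after weighting. The data, the helper `smd`, and the use of an unweighted pooled standard deviation are assumptions for illustration; the overlap weights are computed by hand from a `glm` fit and stand in for the `w` field.

```r
# Hypothetical data: a 0/1 treatment indicator A and one confounder c1
set.seed(2)
n <- 2000
c1 <- runif(n)
A <- rbinom(n, 1, plogis(-1 + 2 * c1))
ps <- fitted(glm(A ~ c1, family = binomial()))
w <- ifelse(A == 1, 1 - ps, ps)  # overlap weights, standing in for the `w` field

# Standardized mean difference of covariate x between groups under weights w
smd <- function(x, a, w) {
  m1 <- weighted.mean(x[a == 1], w[a == 1])
  m0 <- weighted.mean(x[a == 0], w[a == 0])
  s <- sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)  # unweighted pooled SD
  (m1 - m0) / s
}

smd.raw <- smd(c1, A, rep(1, n))  # balance in the raw (observed) data
smd.wtd <- smd(c1, A, w)          # balance in the weighted data
print(c(raw = smd.raw, weighted = smd.wtd))
```

With overlap weights and a main-effects logistic model, the weighted difference in means of a covariate included in the model should be essentially zero, a property discussed in Li, Morgan, and Zaslavsky (2018).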


References

F. Li, K.L. Morgan, and A.M. Zaslavsky. 2018. Balancing Covariates via Propensity Score Weighting. Journal of the American Statistical Association 113 (521): 390–400.

M.A. Hernán, B. Brumback, and J.M. Robins. 2000. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology 11 (5): 561–570.

Examples

# create a data set
n <- 1000
set.seed(7)
c1 <- runif(n)
c2 <- as.numeric(runif(n)< 0.2)
set.seed(77)
cf.m.T1 <- rweibull(n, shape=1, scale=exp(-(-1 + 2*c1)))
cf.m.T2 <-  rweibull(n, shape=1, scale=exp(-(1 + 1*c2)))
cf.m.T <- pmin( cf.m.T1, cf.m.T2)
cf.m.E <- rep(0, n)
cf.m.E[cf.m.T1<=cf.m.T2] <- 1
cf.m.E[cf.m.T2<cf.m.T1] <- 2
set.seed(77)
cf.s.T1 <- rweibull(n, shape=1, scale=exp(-1*c1 ))
cf.s.T2 <-  rweibull(n, shape=1, scale=exp(-2*c2))
cf.s.T <- pmin( cf.s.T1, cf.s.T2)
cf.s.E <- rep(0, n)
cf.s.E[cf.s.T1<=cf.s.T2] <- 1
cf.s.E[cf.s.T2<cf.s.T1] <- 2
exp.z <- exp(0.5 + 1*c1 - 1*c2)
pr <- exp.z/(1+exp.z)
TRT <- ifelse(runif(n)< pr, 1, 0)
X <- ifelse(TRT==1, cf.m.T, cf.s.T)
E <- ifelse(TRT==1, cf.m.E, cf.s.E)
covs.names <- c("c1", "c2")
data <- data.frame(X=X, E=E, TRT=TRT, c1=c1, c2=c2)
form.txt <- paste0("TRT", " ~ ", paste0(covs.names, collapse = "+"))
trt.formula <- as.formula(form.txt)
wei <- get.weights(formula=trt.formula, data=data, wtype = "overlap")
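# overlay the estimated propensity scores of the two groups on common axes
hist(wei$ps[data$TRT==1], col=rgb(1,0,0,0.5), breaks = seq(0,1,0.05),
     main="Estimated propensity scores", xlab="ps")
hist(wei$ps[data$TRT==0], col=rgb(0,0,1,0.5), breaks = seq(0,1,0.05), add=TRUE)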

# please see our package vignette for practical examples

[Package causalCmprsk version 2.0.0 Index]