treatweight {causalweight}R Documentation

Treatment evaluation based on inverse probability weighting with optional sample selection correction.

Description

Treatment evaluation based on inverse probability weighting with optional sample selection correction.

Usage

treatweight(
  y,
  d,
  x,
  s = NULL,
  z = NULL,
  selpop = FALSE,
  ATET = FALSE,
  trim = 0.05,
  logit = FALSE,
  boot = 1999,
  cluster = NULL
)

Arguments

y

Dependent variable.

d

Treatment, must be binary (either 1 or 0), must not contain missings.

x

Confounders of the treatment and outcome, must not contain missings.

s

Selection indicator. Must be one if y is observed (non-missing) and zero if y is not observed (missing). Default is NULL, implying that y does not contain any missings.

z

Optional instrumental variable(s) for selection s. If NULL, outcome selection based on observables (x,d) - known as "missing at random" - is assumed. If z is defined, outcome selection based on unobservables - known as "non-ignorable missingness" - is assumed. Default is NULL. If s is NULL, z is ignored.

selpop

Only to be used if both s and z are defined. If TRUE, the effect is estimated for the selected subpopulation with s=1 only. If FALSE, the effect is estimated for the total population. (note that this relies on somewhat stronger statistical assumptions). Default is FALSE. If s or z is NULL, selpop is ignored.

ATET

If FALSE, the average treatment effect (ATE) is estimated. If TRUE, the average treatment effect on the treated (ATET) is estimated. Default is FALSE.

trim

Trimming rule for discarding observations with extreme propensity scores. If ATET=FALSE, observations with Pr(D=1|X)<trim or Pr(D=1|X)>(1-trim) are dropped. If ATET=TRUE, observations with Pr(D=1|X)>(1-trim) are dropped. If s is defined and z is NULL, observations with extremely low selection propensity scores, Pr(S=1|D,X)<trim, are discarded, too. If s and z are defined, the treatment propensity scores to be trimmed change to Pr(D=1|X,Pr(S=1|D,X,Z)). If in addition selpop is FALSE, observation with Pr(S=1|D,X,Z)<trim are discarded, too. Default for trim is 0.05.

logit

If FALSE, probit regression is used for propensity score estimation. If TRUE, logit regression is used. Default is FALSE.

boot

Number of bootstrap replications for estimating standard errors. Default is 1999.

cluster

A cluster ID for block or cluster bootstrapping when units are clustered rather than iid. Must be numerical. Default is NULL (standard bootstrap without clustering).

Details

Estimation of treatment effects of a binary treatment under a selection on observables assumption assuming that all confounders of the treatment and the outcome are observed. Units are weighted by the inverse of their conditional treatment propensities given the observed confounders, which are estimated by probit or logit regression. Standard errors are obtained by bootstrapping the effect. If s is defined, the procedure allows correcting for sample selection due to missing outcomes based on the inverse of the conditional selection probability. The latter might either be related to observables, which implies a missing at random assumption, or in addition also to unobservables, if an instrument for sample selection is available. See Huber (2012, 2014) for further details.

Value

A treatweight object contains six components: effect, se, pval, y1, y0, and ntrimmed.

effect: average treatment effect (ATE) if ATET=FALSE or the average treatment effect on the treated (ATET) if ATET=TRUE.

se: bootstrap-based standard error of the effect.

pval: p-value of the effect.

y1: mean potential outcome under treatment.

y0: mean potential outcome under control.

ntrimmed: number of discarded (trimmed) observations due to extreme propensity score values.

References

Horvitz, D. G., and Thompson, D. J. (1952): "A generalization of sampling without replacement from a finite universe", Journal of the American Statistical Association, 47, 663–685.

Huber, M. (2012): "Identification of average treatment effects in social experiments under alternative forms of attrition", Journal of Educational and Behavioral Statistics, 37, 443-474.

Huber, M. (2014): "Treatment evaluation in the presence of sample selection", Econometric Reviews, 33, 869-905.

Examples

# A little example with simulated data (10000 observations)
## Not run: 
n=10000
x=rnorm(n); d=(0.25*x+rnorm(n)>0)*1
y=0.5*d+0.25*x+rnorm(n)
# The true ATE is equal to 0.5
output=treatweight(y=y,d=d,x=x,trim=0.05,ATET=FALSE,logit=TRUE,boot=19)
cat("ATE: ",round(c(output$effect),3),", standard error: ",
    round(c(output$se),3), ", p-value: ",round(c(output$pval),3))
output$ntrimmed
## End(Not run)
# An example with non-random outcome selection and an instrument for selection
## Not run: 
n=10000
sigma=matrix(c(1,0.6,0.6,1),2,2)
e=(2*rmvnorm(n,rep(0,2),sigma))
x=rnorm(n)
d=(0.5*x+rnorm(n)>0)*1
z=rnorm(n)
s=(0.25*x+0.25*d+0.5*z+e[,1]>0)*1
y=d+x+e[,2]; y[s==0]=0
# The true ATE is equal to 1
output=treatweight(y=y,d=d,x=x,s=s,z=z,selpop=FALSE,trim=0.05,ATET=FALSE,
       logit=TRUE,boot=19)
cat("ATE: ",round(c(output$effect),3),", standard error: ",
    round(c(output$se),3), ", p-value: ",round(c(output$pval),3))
output$ntrimmed
## End(Not run)

[Package causalweight version 1.1.1 Index]