R: The inverse weighting estimator (nonparametric method)

iw_est {causaldrf}

R Documentation

The inverse weighting estimator (nonparametric method)

Description

This is a nonparametric method that estimates the average dose-response function (ADRF) by a local linear regression of `Y` on `treat` with a weighted kernel function. For details, see Flores et al. (2012).

Usage

iw_est(Y,
       treat,
       treat_formula,
       data,
       grid_val,
       bandw,
       treat_mod,
       link_function,
       ...)

Arguments

`Y`	is the name of the outcome variable contained in `data`.
`treat`	is the name of the treatment variable contained in `data`.
`treat_formula`	an object of class "formula" (or one that can be coerced to that class) that regresses `treat` on a linear combination of `X`: a symbolic description of the model to be fitted.
`data`	is a data frame containing `Y`, `treat`, and the covariates `X`.
`grid_val`	contains the treatment values to be evaluated.
`bandw`	is the bandwidth. Default is 1.
`treat_mod`	a description of the error distribution to be used in the model for treatment. Options include: `"Normal"` for normal model, `"LogNormal"` for lognormal model, `"Sqrt"` for square-root transformation to a normal treatment, `"Poisson"` for Poisson model, `"NegBinom"` for negative binomial model, `"Gamma"` for gamma model.
`link_function`	is either "log", "inverse", or "identity" for the "Gamma" `treat_mod`.
`...`	additional arguments to be passed to the treatment regression function.
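
To illustrate how `treat_formula` and `treat_mod = "Normal"` fit together, the sketch below fits a linear model for the treatment and evaluates the implied normal density at one grid value, which gives each unit an estimated generalized propensity score. It is an illustration in base R under stated assumptions, not the package's internal code; the simulated data and the helper name `gps_normal` are hypothetical.

```r
# Hypothetical simulated data: one covariate X, a continuous treatment, an outcome Y.
set.seed(1)
n <- 200
sim <- data.frame(X = rnorm(n))
sim$treat <- 10 + 2 * sim$X + rnorm(n)
sim$Y <- 5 + 0.5 * sim$treat + sim$X + rnorm(n)

# Hypothetical helper (not part of causaldrf): estimated density of treatment
# level t given covariates, assuming treat | X is normal with a linear mean
# and constant variance.
gps_normal <- function(treat_formula, data, t) {
  fit <- lm(treat_formula, data = data)      # treatment model
  sigma_hat <- summary(fit)$sigma            # residual standard error
  dnorm(t, mean = fitted(fit), sd = sigma_hat)
}

# One density value per unit, evaluated at the grid value t = 10.
gps_at_10 <- gps_normal(treat ~ X, data = sim, t = 10)
head(gps_at_10)
```

Repeating the call for each entry of `grid_val` would give one vector of weights per treatment level under this assumed model.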

Details

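The ADRF is estimated by

(D_{0}(t) S_{2}(t) - D_{1}(t) S_{1}(t)) / (S_{0}(t) S_{2}(t) - S_{1}^{2}(t))

where

D_{j}(t) = \sum_{i = 1}^{N} \tilde{K}_{h, X}(T_i - t) (T_i - t)^j Y_i,

S_{j}(t) = \sum_{i = 1}^{N} \tilde{K}_{h, X}(T_i - t) (T_i - t)^j,

and

\tilde{K}_{h, X}(t) = K_{h}(t) / \hat{R}_i(t).

This is a local linear regression in which each kernel weight is divided by \hat{R}_i(t). Here K_{h} is a kernel function with bandwidth h (`bandw`), and \hat{R}_i(t) is the estimated generalized propensity score for unit i at treatment level t. More details are given in Flores et al. (2012).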
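
The ratio above can be sketched directly in base R for a single treatment value t. This is a minimal illustration, not the package's implementation: the helper name `iw_point`, the Gaussian kernel K_h(u) = dnorm(u / h) / h, and the stand-in weights `gps_t` are all assumptions made for the example.

```r
# Hypothetical helper (not part of causaldrf): evaluates
# (D0 * S2 - D1 * S1) / (S0 * S2 - S1^2) at one treatment value t.
# Assumptions: Gaussian kernel K_h(u) = dnorm(u / h) / h, and `gps_t` holds
# each unit's estimated generalized propensity score at t.
iw_point <- function(t, treat, y, gps_t, h) {
  u <- treat - t
  w <- (dnorm(u / h) / h) / gps_t                  # inverse-weighted kernel
  S <- sapply(0:2, function(j) sum(w * u^j))       # S[1] = S_0, S[2] = S_1, S[3] = S_2
  D <- sapply(0:1, function(j) sum(w * u^j * y))   # D[1] = D_0, D[2] = D_1
  (D[1] * S[3] - D[2] * S[2]) / (S[1] * S[3] - S[2]^2)
}

# Toy check: with an exactly linear outcome, a local linear fit recovers the
# line for any positive weights.
set.seed(2)
treat <- runif(100, 8, 16)
y <- 2 + 3 * treat
gps_t <- runif(100, 0.05, 0.5)                     # arbitrary positive stand-in weights
est <- iw_point(t = 12, treat = treat, y = y, gps_t = gps_t, h = 1)
est
```

Because the outcome here is exactly 2 + 3 * treat, the estimate at t = 12 equals 38 up to floating-point error, whatever positive weights are supplied. With real data, the weights determine how strongly each unit contributes near t.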

Value

iw_est returns an object of class "causaldrf", a list that contains the following components:

`param`	parameter estimates for an iw fit.
`t_mod`	the result of the treatment model fit.
`call`	the matched call.

References

Schafer, J.L., Galagate, D.L. (2015). Causal inference with a continuous treatment and outcome: alternative estimators for parametric dose-response models. Manuscript in preparation.

Flores, C.A., et al. (2012). Estimating the effects of length of exposure to instruction in a training program: the case of Job Corps. Review of Economics and Statistics 94(1), 153-171.

Examples

## Example from Schafer and Galagate (2015).

example_data <- sim_data

iw_list <- iw_est(Y = Y,
                  treat = T,
                  treat_formula = T ~ B.1 + B.2 + B.3 + B.4 + B.5 + B.6 + B.7 + B.8,
                  data = example_data,
                  grid_val = seq(8, 16, by = 1),
                  bandw = bw.SJ(example_data$T),
                  treat_mod = "Normal")

sample_index <- sample(1:nrow(example_data), 100)

plot(example_data$T[sample_index],
     example_data$Y[sample_index],
     xlab = "T",
     ylab = "Y",
     main = "iw estimate")

lines(seq(8, 16, by = 1),
      iw_list$param,
      lty = 2,
      lwd = 2,
      col = "blue")

legend('bottomright',
       "iw estimate",
       lty = 2,
       lwd = 2,
       col = "blue",
       bty = "o",
       cex = 1)

rm(example_data, iw_list, sample_index)

## Example from Imai & van Dyk (2004).

data("nmes_data")
head(nmes_data)
# look at only people with medical expenditures greater than 0
nmes_nonzero <- nmes_data[which(nmes_data$TOTALEXP > 0), ]


iw_list <- iw_est(Y = TOTALEXP,
                  treat = packyears,
                  treat_formula = packyears ~ LASTAGE + I(LASTAGE^2) +
                    AGESMOKE + I(AGESMOKE^2) + MALE + RACE3 + beltuse +
                    educate + marital + SREGION + POVSTALB,
                  data = nmes_nonzero,
                  grid_val = seq(5, 100, by = 5),
                  bandw = bw.SJ(nmes_nonzero$packyears),
                  treat_mod = "LogNormal")

set.seed(307)
sample_index <- sample(1:nrow(nmes_nonzero), 1000)

plot(nmes_nonzero$packyears[sample_index],
     nmes_nonzero$TOTALEXP[sample_index],
     xlab = "packyears",
     ylab = "TOTALEXP",
     main = "iw estimate",
     ylim = c(0, 10000),
     xlim = c(0, 100))

lines(seq(5, 100, by = 5),
      iw_list$param,
      lty = 2,
      lwd = 2,
      col = "blue")

legend('topright',
       "iw estimate",
       lty = 2,
       lwd = 2,
       col = "blue",
       bty = "o",
       cex = 1)
abline(0, 0)

[Package causaldrf version 0.4.2 Index]