lmw_iv {lmw} | R Documentation |
Compute instrumental variable regression-implied weights
Description
Computes the weights implied by an instrumental variable (IV) model, that is, the weights for which the weighted difference in outcome means equals the treatment effect estimate from the supplied model fit with two-stage least squares (2SLS).
Usage
lmw_iv(
  formula,
  data = NULL,
  estimand = "ATE",
  method = "URI",
  treat = NULL,
  iv,
  base.weights = NULL,
  s.weights = NULL,
  obj = NULL,
  fixef = NULL,
  target = NULL,
  target.weights = NULL,
  contrast = NULL,
  focal = NULL
)
Arguments
formula |
a one-sided formula with the treatment and covariates on the right-hand side corresponding to the second-stage (reduced form) outcome regression model to be fit. If an outcome variable is supplied on the left-hand side, it will be ignored. This model should not include an IV. See Details for how this formula is interpreted in light of other options. |
data |
a data frame containing the variables named in |
estimand |
the estimand of interest, which determines how covariates
are centered. Should be one of |
method |
the method used to estimate the weights; either |
treat |
the name of the treatment variable in |
iv |
a character vector or one-sided formula containing the names of
the IVs in |
base.weights |
a vector of base weights. See Details. If omitted and
|
s.weights |
a vector of sampling weights. See Details. If omitted and
|
obj |
a |
fixef |
optional; a string or one-sided formula containing the name of
the fixed effects variable in |
target |
a list or data frame containing the target values for each
covariate included in |
target.weights |
a vector of sampling weights to be applied to
|
contrast |
ignored. |
focal |
the level of the treatment variable to be considered "focal"
(i.e., the "treated" level when |
Details
lmw_iv() computes weights that make the weighted difference in
outcome means between the treatment groups equal to the two-stage least
squares (2SLS) estimate of the treatment effect. formula corresponds
to the second-stage (reduced form) model, with the treatment replaced by its
fitted values resulting from the first-stage model. The first stage is fit
by replacing the treatment in the supplied formula with the IVs named
in iv and using the treatment as the outcome. The treatment is
assumed to be endogenous and the supplied instrumental variables are assumed
to be instruments conditional on the other covariates, which are assumed to
be exogenous.
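The two-stage procedure described above can be sketched in base R on simulated data. This is only an illustration under stated assumptions: the variable names (x, z, u, a, y) and the data-generating process are made up, and the signed weights w computed here (summing to 1 among treated units and to -1 among control units) are one way to write the 2SLS estimate as a weighted sum of outcomes, not necessarily the scaling or sign convention lmw_iv() uses internally.

```r
set.seed(123)
n <- 200
x <- rnorm(n)                 # exogenous covariate (hypothetical)
z <- rbinom(n, 1, 0.5)        # instrument (hypothetical)
u <- rnorm(n)                 # unobserved confounder
a <- as.numeric(0.8 * z + 0.5 * x + u + rnorm(n) > 0.5)  # endogenous binary treatment
y <- 1 + 2 * a + x + u + rnorm(n)

# First stage: treatment as the outcome, IV in place of the treatment
stage1 <- lm(a ~ z + x)
a_hat  <- fitted(stage1)

# Second stage: outcome model with the treatment replaced by its fitted values
stage2   <- lm(y ~ a_hat + x)
est_2sls <- unname(coef(stage2)["a_hat"])

# The 2SLS coefficient is linear in y, so it equals sum(w * y) for
# w = X_hat (X_hat' X_hat)^{-1} e_treat, with e_treat selecting the treatment column
X_hat <- cbind(1, a_hat, x)
w     <- drop(X_hat %*% solve(crossprod(X_hat))[, 2])

# Weighted difference in outcome means between the treatment groups
diff_means <- sum(w[a == 1] * y[a == 1]) - sum(-w[a == 0] * y[a == 0])
all.equal(diff_means, est_2sls)
```

Because the weights depend only on the treatment, instrument, and covariates, they can be inspected before the outcome is ever examined.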
When any treatment-by-covariate interactions are present in formula
or when method = "MRI", covariates are centered at specific values to
ensure the resulting weights correspond to the desired estimand as supplied
to the estimand argument. For the ATE, the covariates are centered at
their means in the full sample. For the ATT and ATC, the covariates are
centered at their means in the treatment or control group (i.e., the
focal group), respectively. For the CATE, the covariates are centered
according to the argument supplied to target (see below). Note that
when covariate-by-covariate interactions are present, they will be centered
after computing the interaction rather than the interaction being computed
on the centered covariates unless estimand = "CATE", in which case
the covariates will be centered at the values specified in target
prior to involvement in interactions. Note that the resulting effect estimate
does not actually correspond to the estimand supplied unless all effect
heterogeneity is due to the included covariates.
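Why the centering point determines the estimand can be seen in a small base R sketch. For brevity it uses ordinary least squares rather than 2SLS, and all names and simulated data are hypothetical. With a treatment-by-covariate interaction, the coefficient on the treatment is the effect at covariate value zero, so shifting the covariate's zero point to a group mean makes that coefficient the model-implied effect at that mean.

```r
set.seed(42)
n <- 300
x <- rnorm(n, mean = 1)                  # covariate (hypothetical)
a <- rbinom(n, 1, plogis(x - 1))         # binary treatment
y <- 1 + a * (1 + 0.5 * x) + x + rnorm(n)

# Uncentered model: coefficient on `a` is the effect at x = 0
b <- coef(lm(y ~ a * x))

# Center at the full-sample mean (ATE) or at the treated-group mean (ATT)
x_ate <- x - mean(x)
x_att <- x - mean(x[a == 1])

est_ate <- unname(coef(lm(y ~ a * x_ate))["a"])
est_att <- unname(coef(lm(y ~ a * x_att))["a"])

# Each centered coefficient equals the uncentered effect evaluated at the
# corresponding covariate mean
c(ATE = est_ate, ATT = est_att)
```

The two estimates differ only through the interaction coefficient multiplied by the difference in centering points, which is why they coincide when there is no effect heterogeneity in x.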
When treatment-by-covariate interactions are included in formula,
additional instruments will be formed as the product of the supplied IVs and
the interacting covariates. When method = "MRI", instruments will be
formed as the product of the supplied IVs and each of the covariates. All
treatment-by-covariate interactions are considered endogenous.
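The construction of product instruments can be sketched with plain matrix algebra in base R. The example below is an assumption-laden illustration (simulated data, hypothetical names, a single instrument z and a single centered covariate xc), not the package's internal code: the treatment a and the interaction a:xc are both treated as endogenous and are instrumented by z and the product z:xc.

```r
set.seed(7)
n  <- 400
x  <- rnorm(n)
xc <- x - mean(x)                 # covariate centered at its full-sample mean
z  <- rbinom(n, 1, 0.5)           # single supplied instrument (hypothetical)
u  <- rnorm(n)                    # unobserved confounder
a  <- as.numeric(z + 0.5 * x + u + rnorm(n) > 0.5)
y  <- 1 + a * (2 + xc) + x + u + rnorm(n)

# Second-stage design: intercept, treatment, treatment-by-covariate, covariate
X <- cbind(1, a, a * xc, xc)
# Instrument matrix: the IV and its product with the interacting covariate
# replace the two endogenous columns
Z <- cbind(1, z, z * xc, xc)

# First stage: project every column of X onto the column space of Z
X_hat <- Z %*% solve(crossprod(Z), crossprod(Z, X))

# Second stage: regress y on the projected design
beta_2sls <- drop(solve(crossprod(X_hat), crossprod(X_hat, y)))
names(beta_2sls) <- c("(Intercept)", "a", "a:xc", "xc")
beta_2sls
```

With as many instruments as endogenous terms, the model is just-identified, so the result coincides with the simple IV estimator solve(crossprod(Z, X), crossprod(Z, y)).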
Base weights and sampling weights
Base weights (base.weights) and sampling weights (s.weights)
are similar in that they both involve combining weights with an outcome
regression model. However, they differ in a few ways. Sampling weights are
primarily used to adjust the target population; when the outcome model is
fit, it is fit using weighted least squares, and when target balance is
assessed, it is assessed using the sampling weighted population as the
target population. Centering of covariates in the outcome model is done
using the sampling weighted covariate means. Base weights are primarily used
to offer a second level of balancing beyond the implied regression weights,
i.e., to fit the 2SLS models in the base-weighted sample. Base weights do
not change the target population, so when target balance is assessed, it is
assessed using the unweighted population as the target population.
Some forms of weights both change the target population and provide an extra
layer of balancing, like propensity score weights that target estimands
other than the ATT, ATC, or ATE (e.g., overlap weights), or matching weights
where the target population is defined by the matching (e.g., matching with
a caliper, cardinality matching, or coarsened exact matching). Because these
weights change the target population, they should be supplied to
s.weights to ensure covariates are appropriately centered. In
lmw_iv(), whether weights are supplied to base.weights or
s.weights will not matter for the estimation of the weights but will
affect the target population in balance assessment.
When both base.weights and s.weights are supplied, e.g., when
the base weights are the result of a propensity score model fit with
sampling weights, it is assumed the base weights do not incorporate the
sampling weights; that is, it is assumed that to estimate a treatment effect
without regression adjustment, the base weights and the sampling
weights would have to be multiplied together. This is true, for example, for
the weights in a matchit or weightit object (see below), but
not for the weights in the output of MatchIt::match.data() (unless called
with include.s.weights = FALSE) or for weights resulting from
CBPS::CBPS().
2SLS after using MatchIt or WeightIt
Instrumental variable regression weights can be computed in a matched or weighted sample
by supplying a matchit or weightit object (from MatchIt
or WeightIt, respectively) to the obj argument of lmw_iv().
The estimand, base weights, and sampling weights (if any) will be taken from
the supplied object and used in the calculation of the implied regression
weights, unless these have been supplied separately to lmw_iv(). The
weights component of the supplied object containing the matching or
balancing weights will be passed to base.weights and the
s.weights component will be passed to s.weights. Arguments
supplied to lmw_iv() will take precedence over the corresponding
components in the obj object.
Multi-category treatments
Multi-category treatments are not currently supported for lmw_iv().
Fixed effects
A fixed effects variable can be supplied to the fixef argument. This is
equivalent to adding the fixed effects
variable as an exogenous predictor that does not interact with the
treatment, IV, or any other covariate. The difference is that computation is
much faster when the fixed effect has many levels because demeaning is used
rather than including the fixed effect variable as a collection of dummy
variables. When using URI, the weights will be the same regardless of
whether the fixed effect variable is included as a covariate or supplied to
fixef; when using MRI, results will differ because the fixed effect
variable does not interact with treatment. The fixed effects variable will
not appear in the summary.lmw() output (but can be added using the
addlvariables argument) or in the model output of lmw_est() or
summary.lmw_est(). Because it does not interact with the
treatment, the distribution of the fixed effect variable may not correspond
to the target population, so caution should be used if it is expected the
treatment effect varies across levels of this variable (in which case it
should be included as a predictor). Currently only one fixed effect variable
is allowed.
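The equivalence between demeaning and dummy variables can be checked with a short base R sketch. It uses ordinary least squares on simulated data for brevity, and every name in it (g_id, g, x, y) is hypothetical; it illustrates the general within-group demeaning idea rather than the package's implementation.

```r
set.seed(1)
n    <- 500
g_id <- sample(1:50, n, replace = TRUE)   # group membership (hypothetical)
g    <- factor(g_id)
x    <- rnorm(n) + g_id / 10              # predictor correlated with group
y    <- 1 + 2 * x + rnorm(50)[g_id] + rnorm(n)

# Dummy-variable approach: one indicator per level of the fixed effect
b_dummy <- unname(coef(lm(y ~ x + g))["x"])

# Demeaning approach: subtract group means, then fit without the factor
x_dm <- x - ave(x, g)
y_dm <- y - ave(y, g)
b_within <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

all.equal(b_dummy, b_within)
```

The demeaned fit estimates a single coefficient instead of one per group level, which is where the speed advantage with many levels comes from.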
Value
An lmw_iv object, which inherits from lmw objects and
contains the following components:
treat |
the treatment variable, given as a factor. |
weights |
the computed implied regression weights. |
covs |
a data frame containing the covariates included in the model formula. |
estimand |
the requested estimand. |
method |
the method
used to estimate the weights ( |
base.weights |
the weights supplied to |
s.weights |
the weights supplied to |
call |
the
original call to |
fixef |
the fixed effects variable
if supplied to |
formula |
the model formula. |
iv |
the instrumental variables, given as a one-sided formula. |
target |
the supplied covariate target values when |
contrast |
the contrasted treatment groups. |
focal |
the focal treatment levels when
|
All functions that lack a specific lmw_iv method work with
lmw_iv objects as they do for lmw objects, such as
summary.lmw(), plot.lmw(), etc.
References
Chattopadhyay, A., & Zubizarreta, J. R. (2023). On the implied weights of linear regression for causal inference. Biometrika, 110(3), 615–629. doi:10.1093/biomet/asac058
See Also
summary.lmw() for summarizing balance and
representativeness; plot.lmw() for plotting features of the
weights; lmw_est() for estimating treatment effects from
lmw_iv objects; influence.lmw() for influence measures;
ivreg() in the ivreg package for fitting 2SLS models.
Examples
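# Load the package and the lalonde data it ships with
library(lmw)
data("lalonde", package = "lmw")

# URI for the ATT using instrument `Ins`
lmw.out <- lmw_iv(~ treat + age + education + race + re74,
                  data = lalonde,
                  estimand = "ATT", method = "URI",
                  treat = "treat", iv = ~Ins)

# Print the object, then summarize balance, adding variables not in the model
lmw.out
summary(lmw.out, addlvariables = ~married + re75)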