LIVE {naivereg}    R Documentation

Logistic-regression Instrumental Variables Estimator

Description

Binary endogenous variables are commonly encountered in program evaluations using observational data. LIVE is a two-stage approach for estimating the effect of a dummy (binary) endogenous treatment using high-dimensional instrumental variables (IVs). In the first stage, a penalized logistic reduced-form model accommodates both the binary nature of the endogenous treatment and the high dimensionality of the instrumental variables. In the second stage, the original treatment variable is replaced by its estimated propensity score, and a least squares regression yields the penalized Logistic-regression Instrumental Variables Estimator (LIVE). If the structural equation model is also high-dimensional, DS-LIVE in this package can be used to select both the control variables and the IVs.

Usage

LIVE(
  y,
  x,
  z,
  penalty = c("SCAD", "MCP", "lasso"),
  nfolds = 5,
  endogenous.index = c(),
  gamma = 3.7,
  alpha = 1,
  lambda.min = 0.05,
  nlambda = 100,
  ...
)

Arguments

y

Response variable, an N x 1 vector.

x

The design matrix, including the endogenous variable, which must be binary (coded 0 or 1).

z

The instrumental variables matrix.

penalty

The penalty to be applied to the model. Either "SCAD" (the default), "MCP", or "lasso".

nfolds

The number of cross-validation folds; the default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds=3.
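As a rough illustration of what nfolds controls, the sketch below assigns observations to folds at random using base R only. The sample size n and the assignment scheme are assumptions made for illustration; the package may form its folds differently.

```r
# Hypothetical illustration: random assignment of n observations to nfolds folds.
set.seed(1)
n <- 103        # assumed sample size
nfolds <- 5
fold <- sample(rep(seq_len(nfolds), length.out = n))
table(fold)     # fold sizes differ by at most one observation
```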

endogenous.index

Indicates which variables in the design matrix are endogenous: a value of 1 marks the corresponding variable as endogenous, and a value of 0 marks it as exogenous. By default, all variables are treated as endogenous.
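Following the description above, an indicator vector with one entry per column of the design matrix could be built as in this small sketch. The matrix and its column names (treat, age, educ) are hypothetical and serve only to show the 0/1 coding.

```r
# Hypothetical design matrix with three columns; only "treat" is endogenous.
x <- cbind(treat = c(1, 0, 1, 1),
           age   = c(34, 51, 28, 45),
           educ  = c(12, 16, 14, 10))

# 1 marks an endogenous column, 0 an exogenous one.
endogenous.index <- as.numeric(colnames(x) == "treat")
endogenous.index  # 1 0 0
```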

gamma

The tuning parameter of the MCP/SCAD penalty. Default is 3.7.
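To show the role gamma plays, here is a minimal sketch of the standard SCAD penalty function written in base R. The function name scad_penalty is made up for this illustration and is not part of the package; the package's internal computation may be organized differently.

```r
# Standard SCAD penalty, written out for illustration (not the package's code).
# gamma sets where the penalty stops growing: it is constant for |beta| > gamma * lambda.
scad_penalty <- function(beta, lambda, gamma = 3.7) {
  b <- abs(beta)
  ifelse(b <= lambda,
         lambda * b,                                   # lasso-like near zero
         ifelse(b <= gamma * lambda,
                (2 * gamma * lambda * b - b^2 - lambda^2) / (2 * (gamma - 1)),
                lambda^2 * (gamma + 1) / 2))           # flat for large |beta|
}

scad_penalty(c(0.5, 2, 10), lambda = 1)
```

Because the penalty flattens out, large coefficients are shrunk less than under the lasso.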

alpha

Tuning parameter for the Mnet estimator which controls the relative contributions from the MCP/SCAD penalty and the ridge, or L2 penalty. alpha=1 is equivalent to MCP/SCAD penalty, while alpha=0 would be equivalent to ridge regression. However, alpha=0 is not supported; alpha may be arbitrarily small, but not exactly 0.

lambda.min

The smallest value for lambda, as a fraction of lambda.max. The default is 0.05.

nlambda

The number of lambda values. The default is 100.
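The way lambda.min and nlambda interact can be sketched as follows. The log-spaced grid and the value of lambda.max are assumptions for illustration; in practice lambda.max would be determined from the data, and the package may space its grid differently.

```r
# Assumed log-spaced grid; lambda.max here is a made-up value for illustration.
lambda.max <- 0.8
lambda.min <- 0.05
nlambda <- 100

# Decreasing sequence from lambda.max down to lambda.min * lambda.max.
lambda <- exp(seq(log(lambda.max), log(lambda.min * lambda.max),
                  length.out = nlambda))
range(lambda)
```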

...

Other arguments.

Details

This is a two-stage estimation procedure. In the first stage, a high-dimensional logistic reduced-form model with a penalty (such as SCAD or lasso) is used to approximate the optimal instrument. In the second stage, the original treatment variable is replaced by its estimated propensity score, and a least squares regression yields the penalized Logistic-regression Instrumental Variables Estimator (LIVE). The high-dimensional IVs can be the original variables or functional transformations of them, such as series or B-spline terms.
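The two stages described above can be sketched in base R on simulated data. This is a simplified illustration, not the package's implementation: the first stage here is an ordinary (unpenalized) logistic regression on a handful of instruments, whereas LIVE uses a penalized fit with cross-validation. All variable names and the data-generating process are assumptions.

```r
# Simplified two-stage sketch (base R only, no penalty); not the package's code.
set.seed(1)
n <- 500
z <- matrix(rnorm(n * 5), n, 5)       # simulated instruments
v <- rlogis(n)                        # first-stage error (logistic)
u <- 0.5 * v + rnorm(n)               # structural error, correlated with v
x <- as.numeric(z[, 1] - z[, 2] + v > 0)   # binary endogenous treatment
y <- 1 + 2 * x + u                    # simulated outcome, treatment effect 2

# Stage 1: logistic reduced form of x on z gives the estimated propensity score.
stage1 <- glm(x ~ z, family = binomial())
xhat <- fitted(stage1)

# Stage 2: least squares of y on the propensity score in place of x.
stage2 <- lm(y ~ xhat)
coef(stage2)
```

A naive regression of y on x would be biased here because x and u share the component v; replacing x with its fitted probability removes that dependence.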

Value

An object of type LIVE which is a list with the following components:

coefficients

The coefficients of x.

lambda.min

The value of lambda that gives the minimum mean cross-validated error (cvm).

ind

The variables of z selected in the first stage.

Xhat

The fitted values of x (the estimated propensity score) predicted from z.

IVnum

The number of instrumental variables retained after selection.

penalty

The penalty used, as specified in the penalty argument.

alpha

The value of alpha used, as specified in the alpha argument.

Author(s)

Qingliang Fan, KongYu He, Wei Zhong

References

Wei Zhong, Wei Zhou, Qingliang Fan and Yang Gao (2020), “Dummy Endogenous Treatment Effect Estimation Using High-Dimensional Instrumental Variables”, working paper.

Examples

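# Logistic-regression Instrumental Variables Estimator
library(naivereg)
data("LIVEdata")
y <- LIVEdata[, 1]     # response
x <- LIVEdata[, 2]     # binary endogenous treatment
z <- LIVEdata[, 3:52]  # instrumental variables
res <- LIVE(y, x, z, penalty = "SCAD", gamma = 3.7, alpha = 1, lambda.min = 0.05)
res$coefficients       # estimated coefficients of x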

[Package naivereg version 1.0.5 Index]