ivr {desk}    R Documentation

Two-Stage Least Squares (2SLS) Instrumental Variable Regression

Description

Performs a two-stage least squares regression on a single equation that includes endogenous regressors Y and exogenous regressors X on the right-hand side. Once the endogenous regressors Y are specified via endog, all remaining regressors X are assumed to be exogenous and are automatically used as instruments in the first stage of the 2SLS. These variables must not be listed in the iv argument; iv takes only instrumental variables from outside the equation under consideration.
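
The role of X in the first stage can be illustrated with a minimal base-R sketch on simulated data. It does not call ivr() and is not taken from the package source; the variable names (y, x_endog, x_exog, z) and the data-generating process are assumptions made purely for illustration.

```r
# Simulated data (hypothetical): x_endog is correlated with the error u,
# x_exog is exogenous, z is an instrument from outside the equation.
set.seed(1)
n <- 500
z <- rnorm(n)
x_exog <- rnorm(n)
u <- rnorm(n)
x_endog <- 0.8 * z + 0.5 * x_exog + 0.6 * u + rnorm(n)
y <- 1 + 2 * x_endog - 1 * x_exog + u

# First stage: regress the endogenous regressor on the outside
# instrument AND on the exogenous regressor of the equation.
x_hat <- fitted(lm(x_endog ~ z + x_exog))

# Second stage: replace the endogenous regressor by its fitted values.
b_2sls <- coef(lm(y ~ x_hat + x_exog))

# Same estimator in matrix form: (X' P_W X)^(-1) X' P_W y,
# where W = [1, z, x_exog] collects all instruments.
X <- cbind(1, x_endog, x_exog)
W <- cbind(1, z, x_exog)
PX <- W %*% solve(crossprod(W), crossprod(W, X))   # P_W X
b_matrix <- drop(solve(crossprod(PX, X), crossprod(PX, y)))

print(unname(b_2sls))
print(unname(b_matrix))
```

Both computations give the same coefficients, because projecting X on W leaves the constant and x_exog unchanged and replaces only x_endog by its first-stage fit.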

Usage

ivr(formula, data = list(), endog, iv, contrasts = NULL, details = FALSE, ...)

Arguments

formula

model formula.

data

name of the data frame used. Needs to be specified if the variables are not stored in the environment.

endog

character vector of endogenous (to be instrumented) regressors.

iv

character vector of predetermined/exogenous instrumental variables NOT already included in the model formula.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

details

logical value indicating whether details should be printed out by default.

...

further arguments that lm.fit() understands.

Value

A list object including:

adj.r.squ adjusted coefficient of determination (adj. R-squared).
coefficients IV-estimators of model parameters.
data/model matrix of the variables' data used.
data.name name of the data frame used.
df degrees of freedom in the model (number of observations minus rank).
exogenous exogenous regressors.
f.hausman exogeneity test: F-value for simultaneous significance of all instrument parameters. If H0: "Instruments are exogenous" is rejected, the use of IV regression instead of OLS can be justified.
f.instr weak instrument test: F-value for significance of instrument parameter in first stage of 2SLS regression. If H0: "Instrument is weak" is rejected, instruments are usually considered sufficiently strong.
fitted.values fitted values of the IV-regression.
fsd first stage diagnostics (weakness of instruments).
has.const logical value indicating whether model has a constant (internal purposes).
instrumented names of the instrumented regressors.
instruments names of the instruments.
model.matrix the model (design) matrix.
ncoef integer, giving the rank of the model (number of coefficients estimated).
nobs number of observations.
p.hausman corresponding p-value of the exogeneity test.
p.instr corresponding p-value of the weak instruments test.
p.values vector of p-values of single parameter significance tests.
r.squ coefficient of determination (R-squared).
residuals residuals in the IV-regression.
response the endogenous (response) variable.
shea Shea's partial R-squared quantifying the ability to explain the endogenous regressors.
sig.squ estimated error variance (sigma-squared).
ssr sum of squared residuals.
std.err vector of standard errors of the parameter estimators.
t.values vector of t-values of single parameter significance tests.
ucov the (unscaled) variance-covariance matrix of the model's estimators.
vcov the (scaled) variance-covariance matrix of the model's estimators.
modform the model's regression R-formula.
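
How residuals, sig.squ, ucov, vcov and std.err relate to each other can be sketched with the textbook 2SLS formulas in base R. That ivr() computes these components in exactly this way is an assumption, not something confirmed by this page; the simulated data and all object names below are hypothetical.

```r
# Simulated data (hypothetical): x is endogenous, z is the instrument.
set.seed(2)
n <- 400
z <- rnorm(n)
u <- rnorm(n)
x <- 0.7 * z + 0.5 * u + rnorm(n)
y <- 1 + 2 * x + u

X <- cbind(const = 1, x = x)   # regressors of the equation
W <- cbind(const = 1, z = z)   # instruments including the constant
X_hat <- W %*% solve(crossprod(W), crossprod(W, X))  # first-stage fits

# 2SLS coefficients
b <- drop(solve(crossprod(X_hat), crossprod(X_hat, y)))

# Residuals are based on the ORIGINAL regressors, not on X_hat
resid_iv <- drop(y - X %*% b)
df <- n - ncol(X)
sig_squ <- sum(resid_iv^2) / df      # estimated error variance
ucov <- solve(crossprod(X_hat))      # unscaled covariance matrix
vcov_iv <- sig_squ * ucov            # scaled covariance matrix
std_err <- sqrt(diag(vcov_iv))

# For comparison: the error variance a naive second-stage OLS reports
resid_naive <- drop(y - X_hat %*% b)
sig_squ_naive <- sum(resid_naive^2) / df

print(c(sig_squ = sig_squ, sig_squ_naive = sig_squ_naive))
print(std_err)
```

The naive variance differs from sig_squ because the second-stage OLS builds its residuals from the fitted values instead of the observed regressor; this is why standard errors from a manual second stage should not be used for inference.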

References

Auer, L.v. (2023): Ökonometrie - Eine Einführung, 8th ed., Springer-Gabler (https://www.oekonometrie-lernen.de).

Wooldridge, J.M. (2013): Introductory Econometrics: A Modern Approach, 5th ed., Cengage Learning. Datasets available for download at Cengage Learning.

Examples

## Numerical Illustration 20.1 in Auer (2023)
ivr(contr ~ score, endog = "score", iv = "contrprev", data = data.insurance, details = TRUE)

## Replicating an example of Ani Katchova (econometric academy)
## (https://www.youtube.com/watch?v=lm3UvcDa2Hc)
## on U.S. Women's Labor-Force Participation (data from Wooldridge 2013)
library(wooldridge)
data(mroz)

# Select only working women
mroz = mroz[mroz$inlf == 1,]
mroz = mroz[, c("lwage", "educ", "exper", "expersq", "fatheduc", "motheduc")]
attach(mroz)

# Regular ols of lwage on educ, where educ is suspected to be endogenous
# hence estimators are biased
ols(lwage ~ educ, data = mroz)

# Manual calculation of ols coeff
Sxy(educ, lwage)/Sxy(educ)

# Manual calculation of iv regression coeff
# with fatheduc as instrument for educ
Sxy(fatheduc, lwage)/Sxy(fatheduc, educ)

# Calculation with 2SLS
educ_hat = ols(educ ~ fatheduc)$fitted
ols(lwage ~ educ_hat)

# Verify that educ_hat is completely determined by values of fatheduc
head(cbind(educ,fatheduc,educ_hat), 10)

# Calculation with ivr()
ivr(lwage ~ educ, endog = "educ", iv = "fatheduc", data = mroz, details = TRUE)

# Multiple regression model with 1 endogenous regressor (educ)
# and two exogenous regressors (exper, expersq)

# Biased ols estimation
ols(lwage ~ educ + exper + expersq, data = mroz)

# Consistent 2SLS estimation with fatheduc and motheduc as instruments
# for the endogenous regressor educ
ivr(lwage ~ educ + exper + expersq,
    endog = "educ", iv = c("fatheduc", "motheduc"),
    data = mroz)

# Manual 2SLS
# First stage: Regress endog. regressor on all exogen. regressors
# and instruments -> get exogenous part of educ
stage1.mod = ols(educ ~ exper + expersq + fatheduc + motheduc)
educ_hat = stage1.mod$fitted

# Second stage: Replace endog regressor with predicted value educ_hat
# Note: the standard errors of this second-stage OLS are uncorrected,
# i.e. they are not valid 2SLS standard errors
stage2.mod = ols(lwage ~ educ_hat + exper + expersq, data = mroz)

## Simple test for endogeneity of educ:
## Include endogenous part of educ into model and see if it is signif.
## (is signif. at 10% level)
uhat = ols(educ ~ exper + expersq + fatheduc + motheduc)$resid
ols(lwage ~ educ + exper + expersq + uhat)
detach(mroz)
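
## Sketch in base R on simulated data (all object names are hypothetical):
## first-stage F-test for joint significance of the outside instruments.
## This illustrates the idea behind a weak-instrument test; it is an
## assumption that ivr() computes f.instr along these lines.
set.seed(3)
n <- 300
sim <- data.frame(z1 = rnorm(n), z2 = rnorm(n), w = rnorm(n))
sim$x <- 0.5 * sim$z1 + 0.3 * sim$z2 + 0.4 * sim$w + rnorm(n)

# First stage without and with the outside instruments z1 and z2
stage1.restricted <- lm(x ~ w, data = sim)
stage1.full <- lm(x ~ w + z1 + z2, data = sim)

# F-test of H0: coefficients of z1 and z2 are both zero
f.test <- anova(stage1.restricted, stage1.full)
f.value <- f.test$F[2]
p.value <- f.test$`Pr(>F)`[2]
print(c(F = f.value, p = p.value))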


[Package desk version 1.1.1 Index]