linear_probit {endogeneity}    R Documentation

Recursive Linear-Probit Model

Description

Estimate linear and probit models with bivariate normally distributed error terms.

First stage (Linear):

m_i=\boldsymbol{\alpha}'\mathbf{w_i}+\sigma u_i

Second stage (Probit):

y_i = 1(\boldsymbol{\beta}'\mathbf{x_i} + {\gamma}m_i + v_i>0)

Endogeneity structure: u_i and v_i are bivariate normally distributed with a correlation of \rho.
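
For reference, the joint log-likelihood implied by the two stages and this error structure can be written out. This is the standard factorization for a recursive model with bivariate normal errors, given here as a sketch: the symbols q_i and \ell are introduced only for illustration, and the package's internal parameterization may differ.

```latex
% Conditional on u_i, v_i is normal with mean \rho u_i and variance 1-\rho^2.
u_i = \frac{m_i - \boldsymbol{\alpha}'\mathbf{w_i}}{\sigma}, \qquad q_i = 2y_i - 1

\ell = \sum_i \left[ \ln\phi(u_i) - \ln\sigma
     + \ln\Phi\!\left( q_i\,
       \frac{\boldsymbol{\beta}'\mathbf{x_i} + \gamma m_i + \rho u_i}{\sqrt{1-\rho^2}}
       \right) \right]
```

Here \phi and \Phi are the standard normal density and distribution function. With \rho = 0 the sum separates into an ordinary linear-regression likelihood and an ordinary probit likelihood.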

Identification of this model requires an instrumental variable that appears in w but not in x. The model can still be estimated if the first-stage dependent variable m_i is not a regressor in the second stage.
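
To make the role of \rho concrete, the log-likelihood of the two equations can be evaluated directly in base R. The sketch below is not the package's implementation: loglik_linear_probit and its argument layout are hypothetical, and it assumes the usual result that, conditional on u_i, v_i is normal with mean \rho u_i and variance 1 - \rho^2.

```r
# Hypothetical helper (not part of the endogeneity package): joint
# log-likelihood of the recursive linear-probit model.
# W and X are design matrices with intercept columns; m is the first-stage
# outcome and y is the binary second-stage outcome.
loglik_linear_probit <- function(alpha, beta, gamma, sigma, rho, m, y, W, X) {
  u <- (m - drop(W %*% alpha)) / sigma   # standardized first-stage residual
  # v | u ~ N(rho * u, 1 - rho^2): shift and rescale the probit index
  index <- (drop(X %*% beta) + gamma * m + rho * u) / sqrt(1 - rho^2)
  q <- 2 * y - 1                         # +1 when y = 1, -1 when y = 0
  sum(dnorm(u, log = TRUE) - log(sigma) + pnorm(q * index, log.p = TRUE))
}

# Simulated data with the same design as the Examples section, base R only
set.seed(1)
N <- 2000
rho <- -0.5
x <- rbinom(N, 1, 0.5)
z <- rnorm(N)
e1 <- rnorm(N)
e2 <- rho * e1 + sqrt(1 - rho^2) * rnorm(N)   # corr(e1, e2) = rho
m <- 1 + x + z + e1
y <- as.numeric(1 + x + m + e2 > 0)

W <- cbind(1, x, z)   # first stage: intercept, x, instrument z
X <- cbind(1, x)      # second stage: intercept, x (m enters through gamma)

# Log-likelihood at the data-generating parameter values
ll_true <- loglik_linear_probit(c(1, 1, 1), c(1, 1), 1, 1, rho, m, y, W, X)

# With rho = 0 the joint likelihood is the sum of two separate likelihoods
ll_indep <- loglik_linear_probit(c(1, 1, 1), c(1, 1), 1, 1, 0, m, y, W, X)
ll_separate <- sum(dnorm(m, mean = drop(W %*% c(1, 1, 1)), sd = 1, log = TRUE)) +
  sum(pnorm((2 * y - 1) * (drop(X %*% c(1, 1)) + m), log.p = TRUE))

print(c(ll_true = ll_true, ll_indep = ll_indep, ll_separate = ll_separate))
```

The last two values should agree up to floating-point error, which is the sense in which \rho = 0 reduces the model to two independent regressions. A function of this shape could be passed to a general-purpose optimizer after packing the parameters into one vector; linear_probit() handles that step itself.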

Usage

linear_probit(
  form_linear,
  form_probit,
  data = NULL,
  par = NULL,
  method = "BFGS",
  init = c("zero", "unif", "norm", "default")[4],
  verbose = 0
)

Arguments

form_linear

Formula for the linear model

form_probit

Formula for the probit model

data

Input data, a data frame

par

Starting values for estimates

method

Optimization algorithm. Default is BFGS

init

Initialization method for the starting values: one of "zero", "unif", "norm", or "default" (the default)

verbose

An integer indicating how much output to display during the estimation process.

  • <0 - No output

  • 0 - Basic output (model estimates)

  • 1 - Moderate output, basic output + parameter and likelihood in each iteration

  • 2 - Extensive output, moderate output + gradient values on each call

Value

A list containing the results of the estimated model, including the estimates table used in the example (est$estimates).

The list also inherits all the components in the output of maxLik. See the documentation of maxLik for more details.

References

Peng, Jing. (2023) Identification of Causal Mechanisms from Randomized Experiments: A Framework for Endogenous Mediation Analysis. Information Systems Research, 34(1):67-84. Available at https://doi.org/10.1287/isre.2022.1113

See Also

Other endogeneity: bilinear(), biprobit_latent(), biprobit_partial(), biprobit(), pln_linear(), pln_probit(), probit_linearRE(), probit_linear_latent(), probit_linear_partial(), probit_linear()

Examples

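library(endogeneity)
library(MASS)   # for mvrnorm()

N = 2000
rho = -0.5      # true correlation between the two error terms
set.seed(1)

# Exogenous regressor x and instrument z (z enters only the first stage)
x = rbinom(N, 1, 0.5)
z = rnorm(N)

# Bivariate normal errors with unit variances and correlation rho
e = mvrnorm(N, mu=c(0,0), Sigma=matrix(c(1,rho,rho,1), nrow=2))
e1 = e[,1]
e2 = e[,2]

# First-stage (linear) and second-stage (probit) outcomes
m = 1 + x + z + e1
y = as.numeric(1 + x + m + e2 > 0)

# Joint estimation; data is left at its default (NULL), so the formulas
# refer to the variables created above
est = linear_probit(m~x+z, y~x+m)
print(est$estimates, digits=3)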

[Package endogeneity version 2.1.3 Index]