R: Endogeneity test in high dimensions

endo.test {RobustIV}

R Documentation

Endogeneity test in high dimensions

Description

Conduct the endogeneity test with high dimensional and possibly invalid instrumental variables.

Usage

endo.test(
  Y,
  D,
  Z,
  X,
  intercept = TRUE,
  invalid = FALSE,
  method = c("Fast.DeLasso", "DeLasso", "OLS"),
  voting = c("MP", "MaxClique"),
  alpha = 0.05,
  tuning.1st = NULL,
  tuning.2nd = NULL
)

Arguments

`Y`	The outcome observation, a vector of length `n`.
`D`	The treatment observation, a vector of length `n`.
`Z`	The instrument observation of dimension `n \times p_z`.
`X`	The covariates observation of dimension `n \times p_x`.
`intercept`	Whether the intercept is included. (default = `TRUE`)
`invalid`	If `TRUE`, the method is robust to the presence of possibly invalid IVs; if `FALSE`, the method assumes that all IVs are valid. (default = `FALSE`)
`method`	The method used to estimate the reduced form parameters. `"OLS"` stands for ordinary least squares, `"DeLasso"` stands for the debiased Lasso estimator, and `"Fast.DeLasso"` stands for the debiased Lasso estimator with fast algorithm. (default = `"Fast.DeLasso"`)
`voting`	The voting option used to estimate valid IVs. `'MP'` stands for majority and plurality voting, and `'MaxClique'` stands for the maximum clique in the IV voting matrix. (default = `'MP'`)
`alpha`	The significance level of the test. (default = `0.05`)
`tuning.1st`	The tuning parameter used in the 1st stage to select relevant instruments. If `NULL`, it is chosen in a data-dependent way; see Details. (default = `NULL`)
`tuning.2nd`	The tuning parameter used in the 2nd stage to select valid instruments. If `NULL`, it is chosen in a data-dependent way; see Details. (default = `NULL`)
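
The dimension requirements listed above can be checked before calling the function. The following is a minimal sketch only: `check_iv_inputs` is a hypothetical helper written for illustration and is not part of RobustIV, and it assumes `X` is always supplied.

```r
# Hypothetical helper (not part of RobustIV): verifies that Y and D have
# length n and that Z and X have n rows, then reports n, p_z and p_x.
check_iv_inputs <- function(Y, D, Z, X) {
  n <- NROW(Y)
  Z <- as.matrix(Z)
  X <- as.matrix(X)
  stopifnot(
    NCOL(Y) == 1, NCOL(D) == 1,  # Y and D are vectors (or one-column matrices)
    NROW(D) == n,                # same number of observations
    nrow(Z) == n, nrow(X) == n   # Z is n x p_z, X is n x p_x
  )
  list(n = n, pz = ncol(Z), px = ncol(X))
}

# Usage on simulated inputs
n <- 50
Z <- matrix(rnorm(n * 4), n, 4)
X <- matrix(rnorm(n * 2), n, 2)
dims <- check_iv_inputs(Y = rnorm(n), D = rnorm(n), Z = Z, X = X)
dims$pz
```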

Details

When `voting = 'MaxClique'` and there are multiple maximum cliques, the null hypothesis is rejected if at least one maximum clique rejects the null. If the tuning parameters for the 1st and 2nd stages are not specified, both are set to `\sqrt{\log n}` for method `"OLS"`, and to `\max(\sqrt{2.01 \log p_z}, \sqrt{\log n})` for the other methods.
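
As an illustration of the default tuning rule described above, the sketch below evaluates the formula directly. `default_tuning` is a hypothetical helper introduced here for illustration; it is not a RobustIV function and is not claimed to match the package's internal code.

```r
# Hypothetical helper (not part of RobustIV): evaluates the default tuning
# value stated in Details for sample size n and number of instruments pz.
default_tuning <- function(n, pz,
                           method = c("Fast.DeLasso", "DeLasso", "OLS")) {
  method <- match.arg(method)
  if (method == "OLS") {
    sqrt(log(n))
  } else {
    max(sqrt(2.01 * log(pz)), sqrt(log(n)))
  }
}

default_tuning(n = 500, pz = 11, method = "OLS")
default_tuning(n = 500, pz = 11)    # sqrt(log(n)) is the larger term here
default_tuning(n = 500, pz = 1000)  # sqrt(2.01 * log(pz)) is the larger term here
```

For the non-OLS methods the two terms coincide with the OLS value until `2.01 \log p_z` exceeds `\log n`, after which the threshold grows with the number of instruments.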

Value

endo.test returns an object of class "endotest", which is a list containing the following components:

`Q`	The test statistic.
`Sigma12`	The estimated covariance of the regression errors.
`SHat`	The set of selected relevant IVs.
`VHat`	The set of selected valid IVs.
`p.value`	The p-value of the endogeneity test.
`check`	The indicator that `H_0:\Sigma_{12}=0` is rejected.

References

Guo, Z., Kang, H., Cai, T. T. and Small, D. S. (2018), Testing endogeneity with high dimensional covariates, Journal of Econometrics, 207(1), 175-187.

Examples


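# Simulate n observations with L candidate IVs and px covariates
n = 500; L = 11; s = 3; k = 10; px = 10;
# Treatment effect beta; the first k IVs have nonzero first-stage coefficients
beta = 1; gamma = c(rep(1,k),rep(0,L-k))
# Covariate coefficients in the outcome model (phi) and treatment model (psi)
phi<-(1/px)*seq(1,px)+0.5; psi<-(1/px)*seq(1,px)+1
# Error covariance with off-diagonal 0.8, so the treatment is endogenous
epsilonSigma = matrix(c(1,0.8,0.8,1),2,2)
Z = matrix(rnorm(n*L),n,L)
X = matrix(rnorm(n*px),n,px)
epsilon = MASS::mvrnorm(n,rep(0,2),epsilonSigma)
D =  0.5 + Z %*% gamma + X %*% psi + epsilon[,1]
# The first s IVs also enter the outcome model directly, so they are invalid
Y = -0.5 + Z %*% c(rep(1,s),rep(0,L-s)) + D * beta + X %*% phi + epsilon[,2]
# Run the test while allowing for invalid IVs
endo.test.model <- endo.test(Y,D,Z,X,invalid = TRUE)
summary(endo.test.model)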

[Package RobustIV version 0.2.5 Index]