dproxyme {factormodel}R Documentation

dproxyme

Description

This function estimates measurement stochastic matrices of discrete proxy variables.

Usage

dproxyme(
  dat,
  sbar = 2,
  initvar = 1,
  initvec = NULL,
  seed = 210313,
  tol = 0.005,
  maxiter = 200,
  miniter = 10,
  minobs = 100,
  maxiter2 = 1000,
  trace = FALSE,
  weights = NULL
)

Arguments

dat

A proxy variable data frame list.

sbar

A number of discrete types. Default is 2.

initvar

A column index of a proxy variable to initialize the EM algorithm. Default is 1. That is, the proxy variable in the first column of "dat" is used for initialization.

initvec

This vector defines how to group the initvar to initialize the EM algorithm.

seed

Seed. Default is 210313 (birthday of this package).

tol

A tolerance for EM algorithm. Default is 0.005.

maxiter

A maximum number of iterations for EM algorithm. Default is 200.

miniter

A minimum number of iterations for EM algorithm. Default is 10.

minobs

Compute likelihood of a proxy variable only if there are more than "minobs" observations. Default is 100.

maxiter2

Maximum number of iterations for "multinom". Default is 1000.

trace

Whether to trace EM algorithm progress. Default is FALSE.

weights

An optional weight vector

Value

Returns a list of 5 components :

M_param

This is a list of estimated measurement (stochastic) matrices. The k-th matrix is a measurement matrix of a proxy variable saved in the kth column of dat data frame (or matrix). The ij-th element in a measurement matrix is the conditional probability of observing j-th (largest) proxy response value conditional on that the latent type is i.

M_param_col

This is a list of column labels of 'M_param' matrices

M_param_row

This is a list of row labels of 'M_param' matrices. It is simply c(1:sbar).

mparam

This is a list of multinomial logit coefficients which were used to compute 'M_param' matrices. These coefficients are useful to compute the likelihood of proxy responses.

typeprob

This is a type probability matrix of size N-by-sbar. The ij-th entry of this matrix gives the probability of observation i to have type j.

Author(s)

Yujung Hwang, yujungghwang@gmail.com

References

Dempster, Arthur P., Nan M. Laird, and Donald B. Rubin (1977)

"Maximum likelihood from incomplete data via the EM algorithm." Journal of the Royal Statistical Society: Series B (Methodological) 39.1 : 1-22. doi: 10.1111/j.2517-6161.1977.tb01600.x

Hu, Yingyao (2008)

Identification and estimation of nonlinear models with misclassification error using instrumental variables: A general solution. Journal of Econometrics, 144(1), 27-61. doi: 10.1016/j.jeconom.2007.12.001

Hu, Yingyao (2017)

The econometrics of unobservables: Applications of measurement error models in empirical industrial organization and labor economics. Journal of Econometrics, 200(2), 154-168. doi: 10.1016/j.jeconom.2017.06.002

Hwang, Yujung (2021)

Identification and Estimation of a Dynamic Discrete Choice Models with Endogenous Time-Varying Unobservable States Using Proxies. Working Paper.

Hwang, Yujung (2021)

Bounding Omitted Variable Bias Using Auxiliary Data. Working Paper.

Examples

dat1 <- data.frame(proxy1=c(1,2,3),proxy2=c(2,3,4),proxy3=c(4,3,2))
## default minimum num of obs to run an EM algorithm is 10
dproxyme(dat=dat1,sbar=2,initvar=1,minobs=3)
## you can specify weights
dproxyme(dat=dat1,sbar=2,initvar=1,minobs=3,weights=c(0.1,0.5,0.4))



[Package factormodel version 1.0 Index]