prodestLP {prodest} | R Documentation |
Estimate productivity - Levinsohn-Petrin method
Description
The prodestLP() function accepts at least six objects (id, time, output, free, state and proxy variables) and returns an S3 object of class prod with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and the estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.
Usage
prodestLP(Y, fX, sX, pX, idvar, timevar, R = 20, cX = NULL,
opt = 'optim', theta0 = NULL, cluster = NULL, tol = 1e-100, exit = FALSE)
Arguments
Y |
the vector of value added log output. |
fX |
the vector/matrix/dataframe of log free variables. |
sX |
the vector/matrix/dataframe of log state variables. |
pX |
the vector/matrix/dataframe of log proxy variables. |
cX |
the vector/matrix/dataframe of control variables. By default cX = NULL. |
idvar |
the vector/matrix/dataframe identifying individual panels. |
timevar |
the vector/matrix/dataframe identifying time. |
R |
the number of block bootstrap repetitions to be performed in the standard error estimation. By default R = 20. |
opt |
a string with the optimization algorithm to be used during the estimation. By default opt = 'optim'. |
theta0 |
a vector with the second stage optimization starting points. By default theta0 = NULL. |
cluster |
an object of class |
tol |
optimizer tolerance. By default tol = 1e-100. |
exit |
indicator for attrition in the data, i.e., whether firms exit the market. By default exit = FALSE. |
Details
Consider a Cobb-Douglas production technology for firm i at time t:
y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}
where y_{it} is the (log) output, w_{it} is a 1xJ vector of (log) free variables, k_{it} is a 1xK vector of state variables and \epsilon_{it} is a normally distributed idiosyncratic error term.
The unobserved technical efficiency parameter \omega_{it} evolves according to a first-order Markov process:
\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}
and u_{it} is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in k_{it} and the lagged free variables w_{it-1}.
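The Markov process above can be illustrated with a short simulation. This is only a sketch: it assumes a linear g(.), namely g(\omega) = rho * \omega, and the panel dimensions, rho and the shock variance are hypothetical values chosen for illustration.

```r
# Illustrative sketch: simulate omega_it = g(omega_it-1) + u_it
# under the ASSUMPTION g(omega) = rho * omega (hypothetical values throughout).
set.seed(1)
n_firms <- 200
n_periods <- 10
rho <- 0.7

omega <- matrix(0, n_firms, n_periods)
omega[, 1] <- rnorm(n_firms, sd = 0.3)
for (tt in 2:n_periods) {
  u <- rnorm(n_firms, sd = 0.3)              # shock, independent of the past
  omega[, tt] <- rho * omega[, tt - 1] + u
}

# Stack current and lagged productivity and recover rho by least squares.
cur  <- c(omega[, -1])
prev <- c(omega[, -n_periods])
rho_hat <- unname(coef(lm(cur ~ prev))["prev"])
rho_hat
```

In practice \omega_{it} is unobserved, so this regression is not feasible directly; the estimation stages described below replace it with quantities recovered from the proxy variable.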
The LP method relies on the following set of assumptions:
a) firms immediately adjust the level of inputs according to the demand function m(\omega_{it}, k_{it}) after the technical efficiency shock is realized;
b) m_{it} is strictly monotone in \omega_{it};
c) \omega_{it} is a scalar unobservable in m_{it} = m(.);
d) the levels of k_{it} are decided at time t-1; the level of the free variable, w_{it}, is decided after the shock u_{it} is realized.
Assumptions a)-d) ensure the invertibility of m_{it} in \omega_{it} and lead to the partially identified model:
y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + h(m_{it}, k_{it}) + \epsilon_{it} = \alpha + w_{it}\beta + \phi(m_{it}, k_{it}) + \epsilon_{it}
which is estimated by a non-parametric approach (first stage).
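The step from the production function to this model can be spelled out as follows. The derivation uses only the definitions given above; the symbol m^{-1} for the inverse of the demand function is notation introduced here for illustration.

```latex
% Invert the input demand, which is possible by strict monotonicity in \omega_{it}:
m_{it} = m(\omega_{it}, k_{it})
  \;\Longrightarrow\;
  \omega_{it} = m^{-1}(m_{it}, k_{it}) \equiv h(m_{it}, k_{it})

% Substitute into the production function:
y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + h(m_{it}, k_{it}) + \epsilon_{it}

% Collect the terms that depend on (m_{it}, k_{it}):
\phi(m_{it}, k_{it}) \equiv k_{it}\gamma + h(m_{it}, k_{it})
```

Because k_{it} enters \phi(.) both linearly and through h(.), \gamma cannot be separated from h(.) at this point; only \beta and \phi(.) are recovered in the first stage.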
Exploiting the Markovian nature of the productivity process, one can use assumption d) to set up the relevant moment conditions and estimate the production function parameters (second stage). These conditions are built on the residual \nu_{it} of:
y_{it} - w_{it}\hat{\beta} = \alpha + k_{it}\gamma + g(\omega_{it-1}, \chi_{it}) + \nu_{it}
where g(.) is typically left unspecified and approximated by an n^{th}-order polynomial, and \chi_{it} is an indicator function for attrition in the market.
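The two stages can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the code used by prodestLP(): the data-generating process, all parameter values and all object names are hypothetical, the panel is balanced with no attrition (so \chi_{it} is dropped), both \phi(.) and g(.) are approximated by cubic polynomials, and the second stage minimizes the sum of squared residuals \nu_{it} over a single state-variable coefficient.

```r
# Sketch only: simulated balanced panel, base R, no attrition.
set.seed(42)
n <- 500; n_t <- 10                          # firms, periods (hypothetical)
beta_true <- 0.6; gamma_true <- 0.4; rho <- 0.7

# Productivity follows a first-order Markov process; capital is set at t-1.
omega <- k <- matrix(0, n, n_t)
omega[, 1] <- rnorm(n, sd = 0.3)
k[, 1] <- rnorm(n)
for (tt in 2:n_t) {
  omega[, tt] <- rho * omega[, tt - 1] + rnorm(n, sd = 0.3)
  k[, tt] <- 0.8 * k[, tt - 1] + 0.3 * omega[, tt - 1] + rnorm(n, sd = 0.3)
}
noise <- function(sd) matrix(rnorm(n * n_t, sd = sd), n, n_t)
m <- omega + 0.5 * k                         # proxy, strictly increasing in omega
w <- 0.6 * omega + 0.2 * k + noise(0.5)      # free input, reacts to omega
y <- 1 + beta_true * w + gamma_true * k + omega + noise(0.1)

# First stage: regress y on w and a cubic polynomial in (m, k) standing in
# for phi(m, k). The fitted phi also absorbs the intercept alpha.
d <- data.frame(y = c(y), w = c(w), m = c(m), k = c(k))
fs <- lm(y ~ w + poly(m, k, degree = 3, raw = TRUE), data = d)
beta_hat <- unname(coef(fs)["w"])
phi_hat <- matrix(fitted(fs) - beta_hat * d$w, n, n_t)

# Second stage: for a candidate gamma, back out lagged productivity as
# phi_hat - gamma * k, approximate g(.) by a cubic polynomial in it, and
# return the sum of squared residuals nu.
crit <- function(g) {
  omega_lag <- c(phi_hat[, -n_t] - g * k[, -n_t])
  lhs <- c(y[, -1] - beta_hat * w[, -1] - g * k[, -1])
  sum(resid(lm(lhs ~ poly(omega_lag, degree = 3, raw = TRUE)))^2)
}
gamma_hat <- optimize(crit, interval = c(0, 1))$minimum

round(c(beta = beta_hat, gamma = gamma_hat), 3)
```

With several state variables the one-dimensional optimize() call would be replaced by a multivariate optimizer, which is the role of the opt and theta0 arguments described above.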
Value
The output of the function prodestLP is a member of the S3 class prod. More precisely, it is a list (of length 3) containing the following elements:
Model, a list containing:
- method: a string describing the method ('LP').
- boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
- elapsed.time: time elapsed during the estimation.
- theta0: numeric object with the optimization starting points - second stage.
- opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
- opt.outcome: optimization outcome.
- FSbetas: first stage estimated parameters.
Data, a list containing:
- Y: the vector of value added log output.
- free: the vector/matrix/dataframe of log free variables.
- state: the vector/matrix/dataframe of log state variables.
- proxy: the vector/matrix/dataframe of log proxy variables.
- control: the vector/matrix/dataframe of log control variables.
- idvar: the vector/matrix/dataframe identifying individual panels.
- timevar: the vector/matrix/dataframe identifying time.
- FSresiduals: numeric object with the residuals of the first stage.
Estimates, a list containing:
- pars: the vector of estimated coefficients.
- std.errors: the vector of bootstrapped standard errors.
Members of class prod have an omega method returning a numeric object with the estimated productivity, that is: \omega_{it} = y_{it} - (\alpha + w_{it}\beta + k_{it}\gamma). The FSres method returns a numeric object with the residuals of the first stage regression, while the summary, show and coef methods are implemented and work as usual.
Author(s)
Gabriele Rovigatti
References
Levinsohn, J. and Petrin, A. (2003). "Estimating production functions using inputs to control for unobservables." The Review of Economic Studies, 70(2), 317-341.
Examples
require(prodest)
## Chilean data on production.
## Publicly available at http://www.ine.cl/canales/chile_estadistico/estadisticas_
## economicas/industria/series_estadisticas/series_estadisticas_enia.php
data(chilean)
# we fit a model with two free (skilled and unskilled), one state (capital)
# and one proxy variable (electricity)
set.seed(154673)
LP.fit <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                    chilean$pX, chilean$idvar, chilean$timevar)
LP.fit.solnp <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                          chilean$pX, chilean$idvar, chilean$timevar, opt = 'solnp')
## Not run:
# run the same model in parallel
require(parallel)
# detectCores() is portable; NUMBER_OF_PROCESSORS is only set on Windows
nCores <- detectCores()
cl <- makeCluster(getOption("cl.cores", max(1, nCores - 1, na.rm = TRUE)))
set.seed(154673)
LP.fit.par <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2),
                        chilean$sX, chilean$pX, chilean$idvar, chilean$timevar,
                        cluster = cl)
stopCluster(cl)
## End(Not run)
# show results
summary(LP.fit)
summary(LP.fit.solnp)
# show results in .tex tabular format
printProd(list(LP.fit, LP.fit.solnp))