R: Estimate productivity - Levinsohn-Petrin method

prodestLP {prodest}

R Documentation

Estimate productivity - Levinsohn-Petrin method

Description

The prodestLP() function accepts at least 6 objects (id, time, output, free, state and proxy variables), and returns an S3 object of class prod with three elements: (i) a list of model-related objects, (ii) a list with the data used in the estimation and estimated vectors of first-stage residuals, and (iii) a list with the estimated parameters and their bootstrapped standard errors.

Usage

  prodestLP(Y, fX, sX, pX, idvar, timevar, R = 20, cX = NULL,
            opt = 'optim', theta0 = NULL, cluster = NULL, tol = 1e-100, exit = FALSE)

Arguments

`Y`	the vector of value added log output.
`fX`	the vector/matrix/dataframe of log free variables.
`sX`	the vector/matrix/dataframe of log state variables.
`pX`	the vector/matrix/dataframe of log proxy variables.
`cX`	the vector/matrix/dataframe of control variables. By default `cX= NULL`.
`idvar`	the vector/matrix/dataframe identifying individual panels.
`timevar`	the vector/matrix/dataframe identifying time.
`R`	the number of block bootstrap repetitions to be performed in the standard error estimation. By default `R = 20`.
`opt`	a string with the optimization algorithm to be used during the estimation. By default `opt = 'optim'`.
`theta0`	a vector with the second stage optimization starting points. By default `theta0 = NULL` and the optimization is run starting from the first stage estimated parameters + `N(\mu=0,\sigma=0.01)` noise.
`cluster`	an object of class `"SOCKcluster"` or `"cluster"`. By default `cluster = NULL`.
`tol`	optimizer tolerance. By default `tol = 1e-100`.
`exit`	indicator for attrition in the data, i.e., whether firms exit the market. By default `exit = FALSE`; if `exit = TRUE`, an indicator function for firms whose last appearance is before the last observation's date is generated and used in the second stage. Alternatively, the user can supply an indicator variable/matrix/dataframe with the exit years.
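
The `exit = TRUE` rule described above can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy vectors `id` and `year` and the name `exit_flag` are assumptions introduced here, and flagging only the final observation of an exiting firm is one possible reading of the rule.

```r
# Hypothetical toy panel: firms 1 and 3 survive to 2003, firm 2 leaves after 2001
id   <- c(1, 1, 1, 2, 2, 3, 3, 3)
year <- c(2001, 2002, 2003, 2000, 2001, 2001, 2002, 2003)

# Last year in which each firm appears, aligned with the observations
last_year <- ave(year, id, FUN = max)

# Flag the final observation of firms whose last appearance
# precedes the last date in the data
exit_flag <- as.numeric(year == last_year & last_year < max(year))
exit_flag
```

A vector built along these lines is the kind of object that could be passed as a user-supplied `exit` indicator, subject to the format the function actually expects.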

Details

Consider a Cobb-Douglas production technology for firm i at time t

y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + \omega_{it} + \epsilon_{it}

where y_{it} is the (log) output, w_{it} is a 1xJ vector of (log) free variables, k_{it} is a 1xK vector of state variables and \epsilon_{it} is a normally distributed idiosyncratic error term. The unobserved technical efficiency parameter \omega_{it} evolves according to a first-order Markov process:

\omega_{it} = E(\omega_{it} | \omega_{it-1}) + u_{it} = g(\omega_{it-1}) + u_{it}

where u_{it} is a random shock component assumed to be uncorrelated with the technical efficiency, the state variables in k_{it} and the lagged free variables w_{it-1}. The LP method relies on the following set of assumptions:
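
As an illustration of the first-order Markov assumption, the sketch below simulates \omega_{it} in base R under the assumed linear form g(\omega_{it-1}) = \rho \omega_{it-1}. The value of `rho`, the shock variance and all object names are assumptions made for this example only; the method itself leaves g(.) unspecified.

```r
set.seed(1)
n_firms   <- 200
n_periods <- 10
rho       <- 0.7   # assumed persistence, so that g(omega) = rho * omega

# One row per firm, one column per period
omega <- matrix(0, n_firms, n_periods)
omega[, 1] <- rnorm(n_firms)
for (t in 2:n_periods) {
  u <- rnorm(n_firms, sd = 0.3)          # random shock u_{it}
  omega[, t] <- rho * omega[, t - 1] + u
}

# Regressing omega_{it} on omega_{it-1} recovers the assumed persistence
omega_cur <- as.vector(omega[, -1])
omega_lag <- as.vector(omega[, -n_periods])
rho_hat   <- unname(coef(lm(omega_cur ~ omega_lag))["omega_lag"])
rho_hat
```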

a) firms immediately adjust the level of the input m_{it} according to the demand function m(\omega_{it}, k_{it}) once the technical efficiency shock is realized;
b) m_{it} is strictly monotone in \omega_{it};
c) \omega_{it} is the only (scalar) unobservable in m_{it} = m(.);
d) the levels of k_{it} are decided at time t-1; the level of the free variable, w_{it}, is decided after the shock u_{it} is realized.

Assumptions a)-d) ensure the invertibility of m_{it} in \omega_{it} and lead to the partially identified model:

y_{it} = \alpha + w_{it}\beta + k_{it}\gamma + h(m_{it}, k_{it}) + \epsilon_{it} = \alpha + w_{it}\beta + \phi(m_{it}, k_{it}) + \epsilon_{it}
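
The first stage can be sketched on simulated data in base R. The example below is an illustration under stated assumptions, not the code run by prodestLP(): the data-generating process, the parameter values and the third-order polynomial used to approximate \phi(m_{it}, k_{it}) are all choices made here.

```r
set.seed(42)
n     <- 5000
alpha <- 1; beta <- 0.6; gamma <- 0.3     # assumed "true" parameters

k     <- rnorm(n)                         # state variable
omega <- rnorm(n, sd = 0.5)               # unobserved productivity
m     <- 0.5 * k + omega                  # input demand, strictly increasing in omega
w     <- 0.3 * k + 0.5 * omega + rnorm(n, sd = 0.5)  # free input responds to omega
y     <- alpha + beta * w + gamma * k + omega + rnorm(n, sd = 0.1)

# Naive OLS: omega is in the error term and is correlated with w
beta_ols <- unname(coef(lm(y ~ w + k))["w"])

# First stage: w enters linearly, phi(m, k) is approximated by a polynomial
fs      <- lm(y ~ w + poly(m, k, degree = 3, raw = TRUE))
beta_fs <- unname(coef(fs)["w"])

# Estimate of alpha + phi(m, k), which is carried over to the second stage
phi_hat <- fitted(fs) - beta_fs * w

c(ols = beta_ols, first_stage = beta_fs)
```

In this setup the OLS coefficient on `w` is biased upwards, while the first-stage coefficient is close to the assumed `beta`. The independent noise in `w` matters: without it, `w` would be an exact function of `m` and `k` and its coefficient would not be separately identified.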

which is estimated by a non-parametric approach - First Stage. Exploiting the Markovian nature of the productivity process, one can use assumption d) to set up the relevant moment conditions and estimate the production function parameters - Second Stage. The second stage exploits the residual \nu_{it} of:

y_{it} - w_{it}\hat{\beta} = \alpha + k_{it}\gamma + g(\omega_{it-1}, \chi_{it}) + \nu_{it}

where g(.) is typically left unspecified and approximated by an n^{th} order polynomial, and \chi_{it} is an indicator function for the attrition in the market.
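
The two stages can be put together in a simplified base R sketch on a simulated balanced panel. Everything here is an assumption made for illustration: the data-generating process, the parameter values, the absence of attrition (so \chi_{it} is dropped), and the use of a least-squares criterion on \nu_{it} minimised with `optimize()` in place of the optimisation routines offered by prodestLP().

```r
set.seed(123)
n_firms <- 1000; n_periods <- 8
alpha <- 0; beta <- 0.6; gamma <- 0.3; rho <- 0.7   # assumed true values

# Simulate the panel: one row per firm, one column per period
omega <- k <- matrix(0, n_firms, n_periods)
omega[, 1] <- rnorm(n_firms, sd = 0.5)
k[, 1]     <- rnorm(n_firms)
for (t in 2:n_periods) {
  omega[, t] <- rho * omega[, t - 1] + rnorm(n_firms, sd = 0.3)
  # k is decided at t-1: it depends on omega_{t-1} but not on the shock u_t
  k[, t] <- 0.8 * k[, t - 1] + 0.3 * omega[, t - 1] + rnorm(n_firms, sd = 0.5)
}
n_obs <- n_firms * n_periods
m <- 0.5 * k + omega
w <- 0.3 * k + 0.5 * omega + matrix(rnorm(n_obs, sd = 0.5), n_firms)
y <- alpha + beta * w + gamma * k + omega + matrix(rnorm(n_obs, sd = 0.1), n_firms)

# First stage: recover beta and phi = alpha + gamma * k + omega
d  <- data.frame(y = as.vector(y), w = as.vector(w),
                 k = as.vector(k), m = as.vector(m))
fs <- lm(y ~ w + poly(m, k, degree = 3, raw = TRUE), data = d)
beta_hat <- unname(coef(fs)["w"])
phi_hat  <- matrix(fitted(fs) - beta_hat * d$w, n_firms)

# Second stage: align current and lagged periods
cur <- 2:n_periods
lag <- 1:(n_periods - 1)
lhs     <- as.vector(y[, cur] - beta_hat * w[, cur])
k_cur   <- as.vector(k[, cur])
k_lag   <- as.vector(k[, lag])
phi_lag <- as.vector(phi_hat[, lag])

# Sum of squared residuals nu_{it} for a candidate value of gamma
ssr <- function(g) {
  omega_lag <- phi_lag - g * k_lag     # omega_{it-1}, up to the constant alpha
  dep       <- lhs - g * k_cur
  # g(.) approximated by a 3rd-order polynomial; the intercept absorbs alpha
  sum(resid(lm(dep ~ poly(omega_lag, degree = 3, raw = TRUE)))^2)
}
gamma_hat <- optimize(ssr, interval = c(0, 1))$minimum

c(beta = beta_hat, gamma = gamma_hat)
```

Identification of `gamma` in this sketch comes from assumption d): current capital is uncorrelated with the current shock, so the criterion is minimised near the assumed value.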

Value

The output of the function prodestLP is a member of the S3 class prod. More precisely, it is a list (of length 3) containing the following elements:

Model, a list containing:

method: a string describing the method ('LP').
boot.repetitions: the number of bootstrap repetitions used for standard errors' computation.
elapsed.time: time elapsed during the estimation.
theta0: numeric object with the optimization starting points - second stage.
opt: string with the optimization routine used - 'optim', 'solnp' or 'DEoptim'.
opt.outcome: optimization outcome.
FSbetas: first stage estimated parameters.

Data, a list containing:

Y: the vector of value added log output.
free: the vector/matrix/dataframe of log free variables.
state: the vector/matrix/dataframe of log state variables.
proxy: the vector/matrix/dataframe of log proxy variables.
control: the vector/matrix/dataframe of log control variables.
idvar: the vector/matrix/dataframe identifying individual panels.
timevar: the vector/matrix/dataframe identifying time.
FSresiduals: numeric object with the residuals of the first stage.

Estimates, a list containing:

pars: the vector of estimated coefficients.
std.errors: the vector of bootstrapped standard errors.
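
The idea behind block-bootstrapped standard errors can be sketched in base R: whole firms are resampled with replacement, so each firm's time series stays intact, and the estimator is recomputed on every draw. The toy panel, the OLS slope used as the estimator and the helper name `estimate` are assumptions for this example, not the routine used internally.

```r
set.seed(7)
# Hypothetical toy panel: 50 firms observed for 5 periods each
panel   <- data.frame(id = rep(1:50, each = 5), time = rep(1:5, times = 50))
panel$x <- rnorm(nrow(panel))
panel$y <- 1 + 0.5 * panel$x + rnorm(nrow(panel))

# Stand-in estimator: OLS slope of y on x
estimate <- function(d) unname(coef(lm(y ~ x, data = d))["x"])

R <- 20                                        # number of bootstrap repetitions
rows_by_firm <- split(seq_len(nrow(panel)), panel$id)

boot_est <- replicate(R, {
  drawn <- sample(names(rows_by_firm), replace = TRUE)   # resample whole firms
  rows  <- unlist(rows_by_firm[drawn], use.names = FALSE)
  estimate(panel[rows, ])
})

# Bootstrapped standard error: spread of the estimates across repetitions
se_boot <- sd(boot_est)
se_boot
```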

Members of class prod have an omega method returning a numeric object with the estimated productivity - that is: \omega_{it} = y_{it} - (\alpha + w_{it}\beta + k_{it}\gamma). The FSres method returns a numeric object with the residuals of the first stage regression, while summary, show and coef methods are implemented and work as usual.
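
The productivity formula above can be worked through by hand in base R. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; with a fitted object, the omega method is the documented route.

```r
# Hypothetical inputs: 4 observations, two free variables, one state variable
y <- c(2.0, 2.5, 3.1, 2.8)
w <- cbind(skilled   = c(1.0, 1.2, 1.5, 1.1),
           unskilled = c(0.8, 0.9, 1.0, 1.3))
k <- c(2.0, 2.2, 2.5, 2.1)

# Hypothetical coefficient values
alpha <- 0.1
beta  <- c(0.3, 0.2)
gamma <- 0.4

# omega_{it} = y_{it} - (alpha + w_{it} beta + k_{it} gamma)
omega <- y - (alpha + as.vector(w %*% beta) + gamma * k)
omega
```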

Author(s)

Gabriele Rovigatti

References

Levinsohn, J. and Petrin, A. (2003). "Estimating production functions using inputs to control for unobservables." The Review of Economic Studies, 70(2), 317-341.

Examples


    require(prodest)

    ## Chilean data on production.
    ## Publicly available at http://www.ine.cl/canales/chile_estadistico/estadisticas_
    ## economicas/industria/series_estadisticas/series_estadisticas_enia.php

    data(chilean)

    # we fit a model with two free (skilled and unskilled), one state (capital)
    # and one proxy variable (electricity)

    set.seed(154673)
    LP.fit <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                        chilean$pX, chilean$idvar, chilean$timevar)
    LP.fit.solnp <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2), chilean$sX,
                        chilean$pX, chilean$idvar, chilean$timevar, opt = 'solnp')

    ## Not run: 
      # run the same model in parallel
      require(parallel)
      # detectCores() is portable; NUMBER_OF_PROCESSORS is only set on Windows
      nCores <- detectCores()
      cl <- makeCluster(getOption("cl.cores", nCores - 1))
      set.seed(154673)
      LP.fit.par <- prodestLP(chilean$Y, fX = cbind(chilean$fX1, chilean$fX2),
                      chilean$sX, chilean$pX, chilean$idvar, chilean$timevar,
                    cluster = cl)
      stopCluster(cl)
    
    ## End(Not run)

    # show results
    summary(LP.fit)
    summary(LP.fit.solnp)

    # show results in .tex tabular format
    printProd(list(LP.fit, LP.fit.solnp))

[Package prodest version 1.0.1 Index]