R: Non-parametric stochastic frontier

fit.sf {snfa}

R Documentation

Non-parametric stochastic frontier

Description

Fits a stochastic frontier to data using kernel smoothing, optionally imposing monotonicity and/or concavity constraints.

Usage

fit.sf(X, y, X.constrained = NA, H.inv = NA, H.mult = 1,
  method = "u", scale.constraints = TRUE)

Arguments

`X`	Matrix of inputs
`y`	Vector of outputs
`X.constrained`	Matrix of inputs where constraints apply
`H.inv`	Inverse of the smoothing matrix (must be positive definite); defaults to rule of thumb
`H.mult`	Scaling factor for rule of thumb smoothing matrix
`method`	Constraints to apply; "u" for unconstrained, "m" for monotonically increasing, and "mc" for monotonically increasing and concave
`scale.constraints`	Boolean; whether to scale constraints by their average value, which can help with convergence

Details

This method fits non-parametric stochastic frontier models. The data-generating process is assumed to be of the form

\ln y_i = \ln f(x_i) + v_i - u_i,

where y_i is the ith observation of output, f is a continuous function, x_i is the ith observation of input, v_i is a normally-distributed error term (v_i\sim N(0, \sigma_v^2)), and u_i is a normally-distributed error term truncated below at zero (u_i\sim N^+(0, \sigma_u^2)). Aigner et al. (1977) developed methods to decompose \varepsilon_i = v_i - u_i into its basic components.
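To make the data-generating process concrete, the following base-R sketch simulates from it. The frontier f(x) = sqrt(x), the sample size, and the values of sigma.v and sigma.u are illustrative assumptions chosen for this example, not package defaults.

```r
# Simulate ln y = ln f(x) + v - u (all numeric values are illustrative assumptions)
set.seed(1)
N <- 500
sigma.v <- 0.1   # standard deviation of the two-sided noise term
sigma.u <- 0.2   # scale of the one-sided inefficiency term

x <- runif(N, min = 1, max = 10)
f <- function(x) sqrt(x)                     # hypothetical frontier, increasing and concave

v <- rnorm(N, mean = 0, sd = sigma.v)        # v ~ N(0, sigma.v^2)
u <- abs(rnorm(N, mean = 0, sd = sigma.u))   # u ~ N+(0, sigma.u^2), i.e. half-normal

y <- exp(log(f(x)) + v - u)                  # observed output
eps <- v - u                                 # composed error
```

Because u is non-negative, the composed error has a negative mean, so observed output tends to lie below the frontier.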

This procedure first fits the mean of the data using fit.mean, producing estimates of output \hat{y}. Log-proportional errors are calculated as

\varepsilon_i = \ln(y_i / \hat{y}_i).
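As a small illustration of the log-proportional errors, the sketch below uses made-up values for y and for the fitted mean (called y.hat here); in practice the fitted values would come from the mean fit described above.

```r
# Toy observed outputs and fitted mean values (made-up numbers)
y     <- c(10, 12, 9, 15)
y.hat <- c(11, 12, 10, 14)

# Log-proportional errors; identical to log(y) - log(y.hat)
eps <- log(y / y.hat)
```

An observation below its fitted mean gets a negative error, one above it a positive error, and an exact fit gives zero.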

Following Aigner et al. (1977), parameters of one- and two-sided error distributions are estimated via maximum likelihood. First,

\hat{\sigma}^2 = \frac1N \sum_{i=1}^N \varepsilon_i^2.

Then, \hat{\lambda} is estimated by solving

\frac1{\hat{\sigma}^2} \sum_{i=1}^N \varepsilon_i\hat{y}_i + \frac{\hat{\lambda}}{\hat{\sigma}} \sum_{i=1}^N \frac{f_i^*}{1 - F_i^*}y_i = 0,

where f_i^* and F_i^* are the standard normal density and distribution functions, respectively, evaluated at \varepsilon_i\hat{\lambda}\hat{\sigma}^{-1}. Parameters of the one- and two-sided distributions are found by solving the identities

\sigma^2 = \sigma_u^2 + \sigma_v^2

\lambda = \frac{\sigma_u}{\sigma_v}.
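The two identities can be solved in closed form: substituting \sigma_u = \lambda\sigma_v into the first gives \sigma^2 = (1 + \lambda^2)\sigma_v^2. A minimal sketch in R, with arbitrary example inputs for sigma2 and lambda:

```r
# Recover sigma.u and sigma.v from sigma^2 and lambda (example inputs are arbitrary)
sigma2 <- 0.05
lambda <- 2

sigma.v <- sqrt(sigma2 / (1 + lambda^2))   # from sigma^2 = (1 + lambda^2) * sigma.v^2
sigma.u <- lambda * sigma.v                # from lambda = sigma.u / sigma.v
```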

Mean efficiency over the sample is given by

\exp\left(-\sqrt{\frac{2}{\pi}}\,\sigma_u\right),

and modal efficiency for each observation is given by

-\varepsilon_i(\sigma_u^2/\sigma^2).
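The mean-efficiency expression rests on the mean of a half-normal variable, E[u] = \sigma_u\sqrt{2/\pi}. The sketch below checks that mean by simulation and then evaluates exp(-E[u]); the value of sigma.u and the number of draws are assumptions made for illustration, and the code is not taken from the package.

```r
set.seed(42)
sigma.u <- 0.2                         # illustrative value
u <- abs(rnorm(1e5, sd = sigma.u))     # draws from N+(0, sigma.u^2)

mean.u.theory <- sigma.u * sqrt(2 / pi)   # analytical mean of the half-normal
mean.u.sim    <- mean(u)                  # simulated counterpart

mean.eff <- exp(-mean.u.theory)        # efficiency implied by mean inefficiency
```

The simulated and analytical means should agree closely, and the implied efficiency lies strictly between 0 and 1.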

Value

Returns a list with the following elements:

`y.fit`	Estimated value of the frontier at X.fit
`gradient.fit`	Estimated gradient of the frontier at X.fit
`mean.efficiency`	Average efficiency for X, y as a whole
`mode.efficiency`	Modal efficiencies for each observation in X, y
`X.eval`	Matrix of inputs used for fitting
`X.constrained`	Matrix of inputs where constraints apply
`X.fit`	Matrix of inputs where curve is fit
`H.inv`	Inverse smoothing matrix used in fitting
`method`	Method used to fit frontier
`scaling.factor`	Factor by which constraints are multiplied before quadratic programming

References

Aigner D, Lovell CK, Schmidt P (1977). “Formulation and estimation of stochastic frontier production function models.” Journal of Econometrics, 6(1), 21–37.

Racine JS, Parmeter CF, Du P (2009). “Constrained nonparametric kernel regression: Estimation and inference.” Working paper.

Examples

data(USMacro)

USMacro <- USMacro[complete.cases(USMacro),]

#Extract data
X <- as.matrix(USMacro[,c("K", "L")])
y <- USMacro$Y

#Fit frontier (stored under a new name so the result does not mask fit.sf())
frontier <- fit.sf(X, y,
                   X.constrained = X,
                   method = "mc")

print(frontier$mean.efficiency)
# [1] 0.9772484

#Plot efficiency over time
library(ggplot2)

plot.df <- data.frame(Year = USMacro$Year,
                      Efficiency = frontier$mode.efficiency)

ggplot(plot.df, aes(Year, Efficiency)) +
  geom_line()

[Package snfa version 0.0.1 Index]