dsfa {dsfa}R Documentation

dsfa: Distributional Stochastic Frontier Analysis


The dsfa package implements the specification, estimation and prediction of distributional stochastic frontier models via mgcv. The basic distributional stochastic frontier model is given by:

Y_n = \eta^\mu(\boldsymbol{x}_n^\mu) + V_n + s \cdot U_n

where n \in \{1,2,...,N\}. V_n and U_n are the noise and (in)efficiency respectively.

Let \theta_n be a parameter of the distribution of Y_n. Further, let g^{-1}_{\theta}(\cdot) be the monotonic response function, which links the additive predictor \eta(\boldsymbol{x}_n^\theta) to the parameter space for the parameter \theta_n via the additive model:

g^{-1}_{\theta}(\theta_n)=\eta(\boldsymbol{x}_n^\theta)=\beta^\theta_0 + \sum_{j^\theta=1}^{J^\theta} h^\theta_{j^\theta}(x^\theta_{nj^\theta})

Thus, the additive predictor \eta(\boldsymbol{x}_n^\theta) is made up by the intercept \beta^\theta_0 and J^\theta smooths terms. The mgcv packages provides a framework for fitting distributional regression models. The additive predictors can be defined via formulae in gam(). Within the formulae for the parameter \theta_n, the smooth function for the variable x^\theta_{nj^\theta} can be specified via the function s(), which is h^\theta_{j^\theta}(\cdot) in the notation above. The smooth functions may be:

An overview is provided at smooth.terms(). The functions gam(), predict.gam() and plot.gam(), are alike to the basic S functions. A number of other functions such as summary.gam(), residuals.gam and anova.gam are also provided, for extracting information from a fitted gamOject.

The main functions are:




#Set seed, sample size and type of function
N=500 #Sample size
s=-1 #Set to production function

#Generate covariates
x1<-runif(N,-1,1); x2<-runif(N,-1,1); x3<-runif(N,-1,1)
x4<-runif(N,-1,1); x5<-runif(N,-1,1)

#Set parameters of the distribution
mu=2+0.75*x1+0.4*x2+0.6*x2^2+6*log(x3+2)^(1/4) #production function parameter
sigma_v=exp(-1.5+0.75*x4) #noise parameter
sigma_u=exp(-1+sin(2*pi*x5)) #inefficiency parameter

#Simulate responses and create dataset
y<-rnormhnorm(n=N, mu=mu, sigma_v=sigma_v, sigma_u=sigma_u, s=s)
dat<-data.frame(y, x1, x2, x3, x4, x5)

#Write formulae for parameters
mu_formula<-y~x1+x2+I(x2^2)+s(x3, bs="ps")
sigma_u_formula<-~1+s(x5, bs="ps")

#Fit model
model<-mgcv::gam(formula=list(mu_formula, sigma_v_formula, sigma_u_formula),
                 data=dat, family=normhnorm(s=s), optimizer = c("efs"))

#Model summary

#Smooth effects
#Effect of x3 on the predictor of the production function
plot(model, select=1) #Estimated function
lines(x3[order(x3)], 6*log(x3[order(x3)]+2)^(1/4)-
      mean(6*log(x3[order(x3)]+2)^(1/4)), col=2) #True effect

#Effect of x5 on the predictor of the inefficiency
plot(model, select=2) #Estimated function
lines(x5[order(x5)], -1+sin(2*pi*x5)[order(x5)]-
      mean(-1+sin(2*pi*x5)),col=2) #True effect

#Estimate efficiency
efficiency(model, type="jondrow")
efficiency(model, type="battese")

#Get elasticities

[Package dsfa version 1.0.1 Index]