R: DEA efficiency

dea {Benchmarking}

R Documentation

DEA efficiency

Description

Estimates a DEA frontier and calculates efficiency measures a la Farrell.

Usage

dea(X, Y, RTS="vrs", ORIENTATION="in", XREF=NULL, YREF=NULL,
    FRONT.IDX=NULL, SLACK=FALSE, DUAL=FALSE, DIRECT=NULL, param=NULL,
    TRANSPOSE=FALSE, FAST=FALSE, LP=FALSE, CONTROL=NULL, LPK=NULL)

## S3 method for class 'Farrell'
print(x, digits=4, ...) 
## S3 method for class 'Farrell'
summary(object, digits=4, ...)

Arguments

X

Inputs of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input). In case TRANSPOSE=TRUE the input matrix is transposed to input x firm.

Y

Outputs of firms to be evaluated, a K x n matrix of observations of K firms with n outputs (firm x output). In case TRANSPOSE=TRUE the output matrix is transposed to output x firm.

RTS

Text string or a number defining the underlying DEA technology / returns to scale assumption.

0	fdh	Free disposability hull, no convexity assumption
1	vrs	Variable returns to scale, convexity and free disposability
2	drs	Decreasing returns to scale, convexity, down-scaling and free disposability
3	crs	Constant returns to scale, convexity and free disposability
4	irs	Increasing returns to scale, (up-scaling, but not down-scaling), convexity and free disposability
5	irs2	Increasing returns to scale (up-scaling, but not down-scaling), additivity, and free disposability
6	add	Additivity (scaling up and down, but only with integers), and free disposability; also known as replicability and free disposability, the free disposability and replicability hull (frh) -- no convexity assumption
7	fdh+	A combination of free disposability and restricted or local constant returns to scale
10	vrs+	As vrs, but with restrictions on the individual lambdas via `param`

ORIENTATION

Input efficiency "in" (1), output efficiency "out" (2), and graph efficiency "graph" (3). For use with DIRECT, an additional option is "in-out" (0).

XREF

Inputs of the firms determining the technology, defaults to X

YREF

Outputs of the firms determining the technology, defaults to Y

FRONT.IDX

Index for firms determining the technology

SLACK

Calculate slack in a phase II calculation by an internal call of the function slack. Note that the precision for calculating slacks for orientation graph is low.

DUAL

Calculate dual variables, i.e. shadow prices; not calculated for orientation graph as that is not an LP problem.

DIRECT

Directional efficiency, DIRECT is either a scalar, an array, or a matrix with non-negative elements.

If the argument is a scalar, the direction is (1,1,...,1) times the scalar; the value of the efficiency depends on the scalar as well as on the units of measurement.

If the argument is an array, this is used for the direction for every firm; the length of the array must correspond to the number of inputs and/or outputs depending on the ORIENTATION.

If the argument is a matrix then different directions are used for each firm. The dimensions depend on the ORIENTATION (and TRANSPOSE); the number of firms must correspond to the number of firms in X and Y.

DIRECT must not be used in connection with ORIENTATION="graph".

param

Possible parameters. Used for RTS="fdh+" to set low and high values for restrictions on lambda, and for RTS="vrs+" to set restrictions on the individual lambdas and on their sum; see the Details section and the examples for its use. Future versions might also use param for other purposes.

TRANSPOSE

With the default value TRANSPOSE=FALSE, input and output matrices are treated as firms x goods matrices, corresponding to the standard in R for statistical models. When TRUE, data matrices are treated as goods x firms matrices, as is normally used in the LP formulation of the problem.

LP

Only for debugging. If LP=TRUE then input and output for the LP program are written to standard output for each unit.

FAST

Only calculate efficiencies and just return them as a vector, i.e. no lambda or other output. The return when using FAST cannot be used as input for slack and peers.

CONTROL

Possible controls to lpSolveAPI, see the documentation for that package; use ?lp.control.options

...

Optional parameters for the print and summary methods.

object, x

An object of class Farrell (returned by the function dea); R generic methods use either 'object' or 'x' as the name of this argument.

digits

Number of digits in printed output, handled by format in print.

LPK

When LPK=k, an MPS file is written for firm k; it can be used as input to an alternative LP solver to check the results.
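A minimal sketch of how some of the arguments above combine, using the data from the Examples section. It assumes the package Benchmarking is installed; the object names (e_vrs, e_crs, e_out, e_ref, ef) are only illustrative.

```r
library(Benchmarking)

# Data from the Examples section: 7 firms, one input, one output
x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Input efficiency under two technologies; the crs technology set contains
# the vrs set, so crs efficiencies cannot exceed vrs efficiencies
e_vrs <- dea(x, y, RTS="vrs", ORIENTATION="in")
e_crs <- dea(x, y, RTS="crs", ORIENTATION="in")
cbind(vrs=eff(e_vrs), crs=eff(e_crs))

# Output efficiency; Farrell output efficiencies are 1 or larger
e_out <- dea(x, y, RTS="vrs", ORIENTATION="out")
eff(e_out)

# Evaluate all 7 firms against a technology determined by firms 1 to 4
e_ref <- dea(x, y, XREF=x[1:4, , drop=FALSE], YREF=y[1:4, , drop=FALSE])
eff(e_ref)

# FAST=TRUE returns only a vector of efficiencies
ef <- dea(x, y, FAST=TRUE)
ef
```

The vector returned with FAST=TRUE cannot be passed on to slack or peers, so it is mainly useful when only the efficiency scores are needed.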

Details

The return from dea and sdea is an object of class Farrell. The efficiency in dea is calculated by the LP method in the package lpSolveAPI. Slacks can be calculated either in the call of dea using the option SLACK=TRUE or in a following call to the function slack.
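The two ways of obtaining slacks can be sketched as follows, assuming the package Benchmarking is installed and using the data from the Examples section. The object names are illustrative only.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Route 1: efficiencies first, slacks in a following call to slack()
e  <- dea(x, y)
sl <- slack(x, y, e)

# Route 2: efficiencies and slacks in one call
el <- dea(x, y, SLACK=TRUE)

# Compare the slack indicators from the two routes
data.frame(eff=eff(e), slack_route1=sl$slack, slack_route2=el$slack)
```

Both routes are expected to flag the same firms; the first is convenient when a Farrell object already exists.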

The directional efficiency when the argument DIRECT is used, depends on the unit of measurement and is not restricted to be less than 1 (or greater than 1 for output efficiency) and is therefore completely different from the Farrell efficiency.

The crs factor in RTS="fdh+" that sets the lower and upper bound can be changed by the argument param, which sets the lower and upper bound to 1-param and 1+param; the default value is param=.15. The value must be greater than or equal to 0 and strictly less than 1. A value of 0 corresponds to RTS="fdh". To get an asymmetric interval, set param to a vector of length 2 with values for the low and high end of the interval, for instance param=c(.8,1.15). The FDH+ technology set is described in Bogetoft and Otto (2011) pages 73–74.
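As an illustration of the rule for param, the following sketch calls dea with the default, with a single value, and with the equivalent explicit interval. It assumes the package Benchmarking is installed; the three calls should give the same efficiencies if param behaves as described above.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Default bounds, documented as param=.15, i.e. the interval [.85, 1.15]
e_default <- dea(x, y, RTS="fdh+")

# One value: bounds are 1-param and 1+param
e_sym <- dea(x, y, RTS="fdh+", param=.15)

# Two values: low and high end given explicitly
e_int <- dea(x, y, RTS="fdh+", param=c(.85, 1.15))

cbind(default=eff(e_default), symmetric=eff(e_sym), interval=eff(e_int))
```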

The technology RTS="vrs+" uses the parameter param to set restrictions on lambda, the convexity parameters. The elements of param are param=(low, high, sum_low, sum_high) where "low" and "high" are restrictions on the individual lambda and "sum_low" and "sum_high" are restrictions on the sum of lambdas. The individual lambda must be in the interval from low to high or be zero. With one parameter the restrictions set are (param, 1+1-(param),1,1), with two parameters (param[1], param[2],1,1), and with four parameters (param[1], param[2],param[3], param[4]). The resulting technology set is not necessarily convex.

The graph-oriented efficiency is calculated by bisection between feasible and infeasible values of G. The precision of the result is lower than for the other orientations.
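A rough check of the graph efficiency, assuming the usual hyperbolic definition in which G is the smallest factor such that (G x, y/G) is feasible. Under that assumption and constant returns to scale, G squared equals the Farrell input efficiency, so the two can be compared up to the bisection precision. The package Benchmarking is assumed to be installed.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

e_in <- dea(x, y, RTS="crs", ORIENTATION="in")
e_gr <- dea(x, y, RTS="crs", ORIENTATION="graph")

# Under crs the graph efficiency should be close to sqrt(input efficiency);
# small differences reflect the lower precision of the bisection
cbind(graph=eff(e_gr), sqrt_input=sqrt(eff(e_in)))
```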

When the argument DIRECT=d is used, the returned value e for input orientation is the excess input measured in d units of measurement, i.e. x-e d, and for output orientation y+e d. The directional efficiency can be restricted to inputs (ORIENTATION="in"), restricted to outputs (ORIENTATION="out"), or include both input and output directions (ORIENTATION="in-out"). Directional efficiency is discussed on pages 31–35 and 121–127 in Bogetoft and Otto (2011).

Value

The results are returned in a Farrell object with the following components. The components slack, sum, sx, and sy are only part of the object when SLACK=TRUE, and ux and vy only when DUAL=TRUE.

`eff`	The efficiencies. Note that when DIRECT is used, the efficiencies are not Farrell efficiencies but rather excess values in DIRECT units of measurement
`lambda`	The lambdas, i.e. the weight of the peers, for each firm
`objval`	The objective value as returned from the LP program; normally the same as eff, but for `slack` it is the sum of the slacks
`RTS`	The returns to scale assumption as in the option `RTS` in the call
`ORIENTATION`	The efficiency orientation as in the call
`TRANSPOSE`	As in the call
`slack`	A logical vector where the component for a firm is `TRUE` if the sum of slacks for the corresponding firm is positive. Only calculated in dea when option `SLACK=TRUE`
`sum`	A vector with sums of the slacks for each firm. Only calculated in dea when option `SLACK=TRUE`
`sx`	A matrix for input slacks for each firm, only calculated if the option `SLACK` is `TRUE` or returned from the method `slack`
`sy`	A matrix for output slack, see `sx`
`ux`	Dual variable for input, only calculated if `DUAL` is `TRUE`.
`vy`	Dual variable for output, only calculated if `DUAL` is `TRUE`.

Note

The arguments X, Y, XREF, and YREF are expected to be matrices or numerical data frames, which the function converts to matrices. When a matrix or data frame is subset to just one column, the resulting object is no longer a matrix or a data frame, but just a numeric (array, vector). Such a numeric input, which is neither a matrix nor a data frame, is converted to a one-column matrix, and using TRANSPOSE=TRUE in that case gives an error.
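The subsetting behaviour described above is plain R and can be seen without the package. The matrix X below is made-up data for illustration; drop=FALSE keeps the matrix class when a single column is selected.

```r
# Hypothetical input matrix: 3 firms, 2 inputs
X <- matrix(c(10, 20, 30, 4, 5, 6), ncol=2)

x1_vec <- X[, 1]                # dimensions are dropped: a numeric vector
x1_mat <- X[, 1, drop=FALSE]    # still a 3 x 1 matrix

is.matrix(x1_vec)
is.matrix(x1_mat)
dim(x1_mat)
```

Passing the one-column matrix rather than the vector keeps the firm x input layout explicit.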

The dual values are not unique for extreme points (firms on the boundary with an efficiency of 1) and therefore the calculated dual values for these firms can depend on the order of firms in the reference technology. The same lack of uniqueness also makes the peers for some firms depend on the order of firms in the reference technology.
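A small sketch of requesting dual values and of checking the effect of firm order, assuming the package Benchmarking is installed. The permutation p is arbitrary and only for illustration. Efficiencies should not depend on the order of firms; the dual values are the components to inspect for order dependence.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Dual values (shadow prices) for inputs and outputs
ed <- dea(x, y, RTS="crs", DUAL=TRUE)
ed$ux
ed$vy

# Same firms in a different order
p  <- c(7, 3, 1, 5, 2, 6, 4)
ep <- dea(x[p, , drop=FALSE], y[p, , drop=FALSE], RTS="crs", DUAL=TRUE)

# Efficiencies agree after reordering
all.equal(as.numeric(eff(ep)), as.numeric(eff(ed))[p])

# Duals side by side; for firms with an efficiency of 1 they may differ
cbind(original=as.numeric(ed$ux)[p], reordered=as.numeric(ep$ux))
```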

To calculate slack use the argument SLACK=TRUE or use the function slack directly.

When there is slack, and slack is not taken into consideration, then the peers for a firm with slack might depend on the order of firms in the data set; this is a property of the LP algorithm used to solve the problem.

To handle fixed, non-discretionary inputs, one can let them appear as negative outputs in an input-based mode, and conversely for fixed, non-discretionary outputs. Fixed inputs (outputs) can also be handled by directional efficiency; set the direction, the argument DIRECT, equal to the variable, discretionary inputs (outputs) and 0 for the fixed inputs (outputs).
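The directional approach to a fixed input can be sketched as follows. The data X and Y are made up for illustration, with the second input treated as fixed; the package Benchmarking is assumed to be installed.

```r
library(Benchmarking)

# Hypothetical data: 5 firms, 2 inputs, 1 output
X <- cbind(x1=c(20, 40, 30, 60, 50), x2=c(50, 30, 40, 60, 70))
Y <- matrix(c(10, 10, 10, 12, 11), ncol=1)

# Direction: the discretionary input x1 itself, and 0 for the fixed input x2
D <- X
D[, 2] <- 0
e_fix <- dea(X, Y, ORIENTATION="in", DIRECT=D)

# For comparison: both inputs reduced proportionally
e_rad <- dea(X, Y, ORIENTATION="in", DIRECT=X)

# Returned values are excess shares of x1 (and of both inputs, respectively)
cbind(fixed_x2=eff(e_fix), both=eff(e_rad))
```

Because the fixed input no longer has to be reduced, the excess found for the discretionary input is at least as large as in the proportional case.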

When the argument DIRECT=X is used, the returned efficiency is equal to 1 minus the Farrell efficiency for input orientation and to the Farrell efficiency minus 1 for output orientation.
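This relation can be checked numerically. The sketch below assumes the package Benchmarking is installed and, for output orientation, assumes that the analogous direction is the output matrix y.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Input orientation: direction equal to the inputs
e_far_in <- dea(x, y, ORIENTATION="in")
e_dir_in <- dea(x, y, ORIENTATION="in", DIRECT=x)
cbind(directional=eff(e_dir_in), one_minus_farrell=1 - eff(e_far_in))

# Output orientation: direction equal to the outputs (assumed analogue)
e_far_out <- dea(x, y, ORIENTATION="out")
e_dir_out <- dea(x, y, ORIENTATION="out", DIRECT=y)
cbind(directional=eff(e_dir_out), farrell_minus_one=eff(e_far_out) - 1)
```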

To use matrices X and Y prepared for the methods in the package FEAR (Wilson 2008) set the option TRANSPOSE=TRUE; for consistency with FEAR the options RTS and ORIENTATION also accept numbers as in FEAR.
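A sketch of a call in the goods x firms layout with numeric codes, using the codes listed under RTS and ORIENTATION (3 for crs, 1 for input efficiency). It assumes the package Benchmarking is installed; the transposed matrices xt and yt are created here only for illustration.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

# Goods x firms layout: one row per input (output), one column per firm
xt <- t(x)
yt <- t(y)

e_num <- dea(xt, yt, RTS=3, ORIENTATION=1, TRANSPOSE=TRUE)
e_txt <- dea(x, y, RTS="crs", ORIENTATION="in")

# The efficiencies should agree between the two calls
cbind(numeric_codes=eff(e_num), text_codes=eff(e_txt))
```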

The tolerance for lambda being zero or one is 1e-7, the default value of 'epsint' in the package lpSolveAPI, i.e. values closer than 1e-7 to zero or one are set to the respective integer value. The 'epsint' is the tolerance used to determine whether a floating-point number is in fact an integer. The same tolerance is used for efficiency values near one.

Some scaling is done in the function, but this does not always work satisfactorily; sometimes a solution cannot be found, in which case the program prints a warning and the efficiency for the firm is set to NA. Often this is due to bad scaling of the data. Either the user can try a different scaling of the data when calling the function, or one can use the option CONTROL to try a different scaling by the program. For instance, one can insert CONTROL=list(scaling=c("geometric", "equilibrate")) or CONTROL=list(scaling=c("curtisreid", "equilibrate", "dynupdate")) in the option list for the function call. The full list of possible scaling options can be found from ?lp.control.options under "scaling".

If a numerical problem occurs, status=5, the best solution is probably to scale the input X and output Y yourself or use a different scaling option as described above. The best results are obtained when the variables are close to 1. If some variables are in the millions, then let the unit of measure be a million.
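Both remedies can be sketched as follows, assuming the package Benchmarking is installed. Farrell efficiencies do not depend on the unit of measurement, so rescaling a variable should leave them unchanged; the divisor 100 is an arbitrary choice for this data.

```r
library(Benchmarking)

x <- matrix(c(100,200,300,500,100,200,600), ncol=1)
y <- matrix(c(75,100,300,400,25,50,400), ncol=1)

e_raw <- dea(x, y)

# Remedy 1: rescale the data so that values are closer to 1
e_scaled <- dea(x/100, y/100)

# Remedy 2: ask lpSolveAPI for a different scaling via CONTROL
e_ctrl <- dea(x, y, CONTROL=list(scaling=c("geometric", "equilibrate")))

cbind(raw=eff(e_raw), rescaled=eff(e_scaled), control=eff(e_ctrl))
```

For well-behaved data such as this, all three columns are expected to agree; the remedies matter when a firm would otherwise get NA.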

Author(s)

Peter Bogetoft and Lars Otto larsot23@gmail.com

References

Peter Bogetoft and Lars Otto; Benchmarking with DEA, SFA, and R; Springer 2011

Examples

x <- matrix(c(100,200,300,500,100,200,600),ncol=1)
y <- matrix(c(75,100,300,400,25,50,400),ncol=1)
dea.plot.frontier(x,y,txt=TRUE)

e <- dea(x,y)
eff(e)
print(e)
summary(e)
lambda(e)

# Input savings potential for each firm
(1-eff(e)) * x
(1-e$eff) * x

# calculate slacks
el <- dea(x,y,SLACK=TRUE)
data.frame(e$eff,el$eff,el$slack,el$sx,el$sy)

# Fully efficient units, eff==1 and no slack
which(eff(e) == 1 & !el$slack)

# fdh+ with limits in the interval [.7, 1.2]
dea(x,y,RTS="fdh+", param=c(.7,1.2))

[Package Benchmarking version 0.32 Index]