fitLorenz {biogeom}

R Documentation

Data-Fitting Function for the Rotated and Right-Shifted Lorenz Curve

Description

fitLorenz is used to estimate the parameters of the rotated and right-shifted Lorenz curve using version 4 or 5 of MPerformanceE, or the Lorenz equations including SarabiaE, SCSE, and SHE.

Usage

fitLorenz(expr, z, ini.val, simpver = 4, 
          control = list(), par.list = FALSE, 
          fig.opt = FALSE, np = 2000, 
          xlab=NULL, ylab=NULL, main = NULL, subdivisions = 100L,
          rel.tol = .Machine$double.eps^0.25, 
          abs.tol = rel.tol, stop.on.error = TRUE, 
          keep.xy = FALSE, aux = NULL, par.limit = TRUE)

Arguments

`expr`	version 4 or 5 of `MPerformanceE`, or the Lorenz equations including `SarabiaE`, `SCSE`, and `SHE`.
`z`	the observations of size distribution (e.g., the household income distribution or the leaf size distribution).
`ini.val`	the initial values of the model parameters.
`simpver`	an optional argument to use version 4 or 5 of `MPerformanceE`.
`control`	the list of control parameters for using the `optim` function in package stats.
`par.list`	the option of showing the list of parameters on the screen.
`fig.opt`	an optional argument to draw the original and rotated Lorenz curves.
`np`	the number of data points to draw the predicted original and rotated Lorenz curves.
`xlab`	the label of the `x`-axis when showing the original Lorenz curve.
`ylab`	the label of the `y`-axis when showing the original Lorenz curve.
`main`	the main title of the figure.
`subdivisions`	please see the arguments for the `integrate` function in package stats.
`rel.tol`	please see the arguments for the `integrate` function in package stats.
`abs.tol`	please see the arguments for the `integrate` function in package stats.
`stop.on.error`	please see the arguments for the `integrate` function in package stats.
`keep.xy`	please see the arguments for the `integrate` function in package stats.
`aux`	please see the arguments for the `integrate` function in package stats.
`par.limit`	an optional argument to limit the numerical ranges of model parameters of the three Lorenz equations including `SarabiaE`, `SCSE`, and `SHE`.

Details

Here, ini.val is a list that contains only the initial values of the model parameters. The Nelder-Mead algorithm (Nelder and Mead, 1965), as implemented in the optim function in package stats, is used to minimize the residual sum of squares (RSS) between the observed and predicted y values. Versions 4 and 5 of MPerformanceE and the Lorenz equations SarabiaE, SCSE, and SHE can be used to fit the rotated and right-shifted Lorenz curve.
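The fitting step described above can be sketched in base R. This is a minimal illustration, not the package's implementation: perf5 is a hypothetical stand-in for the simplified version 5 equation, the data are synthetic, and the use of expand.grid to try every combination of the candidate initial values is an assumption about how a vector inside ini.val might be handled.

```r
# Hypothetical stand-in for simplified version 5 of the performance equation
perf5 <- function(P, x) {
  y <- P[1] * (1 - exp(-P[2] * x)) * (1 - exp(P[3] * (x - sqrt(2))))
  y[x <= 0 | x >= sqrt(2)] <- 0   # the curve is zero outside (0, sqrt(2))
  y
}

# Synthetic "observed" data on the rotated and right-shifted axes
set.seed(1)
x_obs <- seq(0.01, sqrt(2) - 0.01, length.out = 50)
y_obs <- perf5(c(0.4, 3, 2), x_obs) + rnorm(50, sd = 0.002)

# Objective: residual sum of squares between observed and predicted y
rss <- function(P) sum((y_obs - perf5(P, x_obs))^2)

# Assumption: each combination of candidate initial values is tried in turn
ini.val <- list(0.5, 0.1, c(0.01, 0.1, 1, 5, 10))
starts  <- expand.grid(ini.val)

fits <- lapply(seq_len(nrow(starts)), function(i) {
  p0 <- unlist(starts[i, ], use.names = FALSE)
  optim(p0, rss, method = "Nelder-Mead", control = list(maxit = 5000))
})

# Keep the fit with the smallest RSS
values <- vapply(fits, function(f) f$value, numeric(1))
best   <- fits[[which.min(values)]]
best$par     # parameter estimates of the best fit
best$value   # its RSS
```

Trying several starting points is a common safeguard because the Nelder-Mead method is a local search and can stop at different solutions depending on where it begins.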

When simpver = 4, the simplified version 4 of MPerformanceE is selected:

\mbox{if } x \in{\left(0, \ \sqrt{2}\right)},

y = c\left(1-e^{-K_{1}x}\right)^{a}\left(1-e^{K_{2}\left(x-\sqrt{2}\right)}\right)^{b};

\mbox{if } x \notin{\left(0, \ \sqrt{2}\right)},

y = 0.

Here, the parameter vector P has five elements, representing the values of c, K_{1}, K_{2}, a, and b, respectively.

When simpver = 5, the simplified version 5 of MPerformanceE is selected:

\mbox{if } x \in{\left(0, \ \sqrt{2}\right)},

y = c\left(1-e^{-K_{1}x}\right)\left(1-e^{K_{2}\left(x-\sqrt{2}\right)}\right);

\mbox{if } x \notin{\left(0, \ \sqrt{2}\right)},

y = 0.

Here, the parameter vector P has three elements, representing the values of c, K_{1}, and K_{2}, respectively.
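The two simplified versions can be written as one small base R function. This is a sketch transcribed from the formulae above; perf_sketch is a hypothetical name and is not the package's MPerformanceE. Version 5 is treated as version 4 with a = b = 1, which follows directly from comparing the two equations.

```r
# Hypothetical helper that evaluates simplified versions 4 and 5
perf_sketch <- function(P, x, simpver = 4) {
  rt2 <- sqrt(2)
  if (simpver == 4) {
    a <- P[4]; b <- P[5]          # P = (c, K1, K2, a, b)
  } else if (simpver == 5) {
    a <- 1; b <- 1                # P = (c, K1, K2)
  } else {
    stop("simpver must be 4 or 5 in this sketch")
  }
  y      <- numeric(length(x))    # y = 0 outside (0, sqrt(2))
  inside <- x > 0 & x < rt2
  xi     <- x[inside]
  y[inside] <- P[1] * (1 - exp(-P[2] * xi))^a *
                      (1 - exp(P[3] * (xi - rt2)))^b
  y
}

x  <- seq(-0.2, 1.6, by = 0.1)
y4 <- perf_sketch(c(0.5, 2, 3, 1, 1), x, simpver = 4)
y5 <- perf_sketch(c(0.5, 2, 3), x, simpver = 5)
all.equal(y4, y5)   # version 4 with a = b = 1 matches version 5
```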

For the Lorenz functions, the user can define any formula that follows the form Lorenz.fun <- function(P, x){...}, where P is the vector of parameter(s), x is the predictor, which ranges between 0 and 1 and represents the cumulative proportion of the number of individuals in a statistical unit, and Lorenz.fun is the name of a Lorenz function defined by the user, whose value also ranges between 0 and 1 and represents the cumulative proportion of the income or size in a statistical unit. This package provides three representative Lorenz functions: SarabiaE, SCSE, and SHE.
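As an illustration of the required form, the sketch below defines a one-parameter power-law Lorenz function. PowerLorenz is a hypothetical name chosen for this example and is not part of the package; the curve y = x^k with k >= 1 is used only because it is a simple function that passes through (0, 0) and (1, 1) and stays on or below the line of equality.

```r
# Hypothetical user-defined Lorenz function of the form function(P, x){...}
PowerLorenz <- function(P, x) {
  k <- P[1]   # single shape parameter, assumed to satisfy k >= 1
  x^k
}

x <- seq(0, 1, by = 0.25)
PowerLorenz(2, x)   # 0.0000 0.0625 0.2500 0.5625 1.0000
```

A function written this way should be usable wherever a Lorenz function is expected as expr, with ini.val giving one initial value per element of P.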

When MPerformanceE is selected, the Gini coefficient (GC) is calculated as follows:

\mbox{GC} = 2\int_{0}^{\sqrt{2}}y\,dx,

where x and y are the independent and dependent variables in version 4 or 5 of MPerformanceE, respectively.

However, when a Lorenz function, e.g., SCSE, is selected, the Gini coefficient (GC) is calculated as follows:

\mbox{GC} = 1 - 2\int_{0}^{1}y\,dx,

where x and y are the independent and dependent variables in the Lorenz function, respectively.

For SarabiaE and SHE, there are explicit formulae for GC (Sarabia, 1997; Sitthiyot and Holasut, 2023).
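The Gini coefficient is twice the area between the line of equality and the Lorenz curve, and that area is unchanged by rotating and shifting the curve. The sketch below checks this numerically for a hypothetical Lorenz function y = x^2. The specific transformation (a 135-degree counterclockwise rotation followed by a right shift of sqrt(2)) is an assumption chosen so that the curve lies above the x-axis on (0, sqrt(2)); the trapezoidal rule stands in for whatever integration a fitted curve would use.

```r
# Hypothetical Lorenz function used only for this check
lorenz <- function(x) x^2

# Twice the area between the line of equality and the Lorenz curve
gc_area <- 2 * integrate(function(x) x - lorenz(x), lower = 0, upper = 1)$value

# Assumed transformation: rotate by 135 degrees counterclockwise, shift right by sqrt(2)
x1 <- seq(0, 1, length.out = 2000)
y1 <- lorenz(x1)
x  <- sqrt(2) - (x1 + y1) / sqrt(2)
y  <- (x1 - y1) / sqrt(2)

# Trapezoidal rule for the area under the rotated curve on (0, sqrt(2))
o    <- order(x)
xs   <- x[o]
ys   <- y[o]
area <- sum(diff(xs) * (ys[-1] + ys[-length(ys)]) / 2)
gc_rotated <- 2 * area

c(gc_area, gc_rotated)   # both are close to 1/3 for y = x^2
```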

Value

`x1`	the cumulative proportion of the number of an entity of interest, e.g., the number of households of a city or the number of leaves of a plant.
`y1`	the cumulative proportion of the size of an entity of interest.
`x`	the `x` coordinates of the rotated and right-shifted `y1` versus `x1`.
`y`	the `y` coordinates of the rotated and right-shifted `y1` versus `x1`.
`par`	the estimates of the model parameters.
`r.sq`	the coefficient of determination between the observed and predicted `y` values.
`RSS`	the residual sum of squares between the observed and predicted `y` values.
`sample.size`	the number of data points used in the data fitting.
`GC`	the calculated Gini coefficient.
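To make the returned coordinates concrete, the sketch below builds cumulative proportions from a small hypothetical sample z. Sorting in ascending order follows the usual definition of a Lorenz curve; starting both vectors at the origin is an assumption and may differ from what the function returns.

```r
z  <- c(2, 5, 1, 8, 4)               # hypothetical sizes (e.g., leaf areas)
zs <- sort(z)                        # ascending order
n  <- length(zs)

x1 <- c(0, seq_len(n) / n)           # cumulative proportion of the number
y1 <- c(0, cumsum(zs) / sum(zs))     # cumulative proportion of the size

cbind(x1, y1)
```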

Note

When MPerformanceE is selected, the parameter estimates are those of MPerformanceE fitted to the rotated and right-shifted data (y versus x); they are not obtained by directly fitting the y1 versus x1 data. When a Lorenz function is selected, the parameter estimates are those of the Lorenz function.

Author(s)

Peijian Shi pjshi@njfu.edu.cn, Johan Gielis johan.gielis@uantwerpen.be, Brady K. Quinn Brady.Quinn@dfo-mpo.gc.ca.

References

Huey, R.B., Stevenson, R.D. (1979) Integrating thermal physiology and ecology of ectotherms: a discussion of approaches. American Zoologist 19, 357-366. doi:10.1093/icb/19.1.357

Lian, M., Shi, P., Zhang, L., Yao, W., Gielis, J., Niklas, K.J. (2023) A generalized performance equation and its application in measuring the Gini index of leaf size inequality. Trees - Structure and Function 37, 1555-1565. doi:10.1007/s00468-023-02448-8

Lorenz, M.O. (1905) Methods of measuring the concentration of wealth. Journal of the American Statistical Association 9(70), 209-219. doi:10.2307/2276207

Nelder, J.A., Mead, R. (1965) A simplex method for function minimization. Computer Journal 7, 308-313. doi:10.1093/comjnl/7.4.308

Sarabia, J.-M. (1997) A hierarchy of Lorenz curves based on the generalized Tukey's lambda distribution. Econometric Reviews 16, 305-320. doi:10.1080/07474939708800389

Shi, P., Gielis, J., Quinn, B.K., Niklas, K.J., Ratkowsky, D.A., Schrader, J., Ruan, H., Wang, L., Niinemets, Ü. (2022) 'biogeom': An R package for simulating and fitting natural shapes. Annals of the New York Academy of Sciences 1516, 123-134. doi:10.1111/nyas.14862

Sitthiyot, T., Holasut, K. (2023) A universal model for the Lorenz curve with novel applications for datasets containing zeros and/or exhibiting extreme inequality. Scientific Reports 13, 4729. doi:10.1038/s41598-023-31827-x

Examples

  # Load the example data set
  data(LeafSizeDist)

  # Extract the number before the first "-" in each code
  CulmNumber <- c()
  for(i in seq_along(LeafSizeDist$Code)){
    temp <- as.numeric( strsplit(LeafSizeDist$Code[i], "-", fixed=TRUE)[[1]][1] )
    CulmNumber <- c(CulmNumber, temp)
  }
  uni.CN <- sort( unique(CulmNumber) )
  ind    <- CulmNumber==uni.CN[1]    # rows belonging to the first culm number
  A0     <- LeafSizeDist$A[ind]      # the A values for those rows

  # Initial values in the order c, K1, K2, a, b (version 4) and
  # c, K1, K2 (version 5); several candidate values are given for K2
  ini.val1 <- list(0.5, 0.1, c(0.01, 0.1, 1, 5, 10), 1, 1)
  ini.val2 <- list(0.5, 0.1, c(0.01, 0.1, 1, 5, 10))
  resu1 <- fitLorenz(MPerformanceE, z=A0, ini.val=ini.val1, simpver=4, fig.opt=TRUE)
  resu2 <- fitLorenz(MPerformanceE, z=A0, ini.val=ini.val2, simpver=5, fig.opt=TRUE)
  resu1$par
  resu2$par

  # Fit the SarabiaE Lorenz equation with parameter limits switched on
  ini.val3 <- list(0.9, c(10, 50, 100, 500), 1, 0)
  resu3 <- fitLorenz( SarabiaE, z=A0, ini.val=ini.val3, par.limit=TRUE, 
                      fig.opt=TRUE, control=list(reltol=1e-20, maxit=10000) )
  resu3$par
  resu3$GC

  graphics.off()

[Package biogeom version 1.4.3 Index]