R: Lorenz curve

svylorenz {convey}

R Documentation

Lorenz curve

Description

Estimate the Lorenz curve, a graphical representation of inequality.

Usage

svylorenz(formula, design, ...)

## S3 method for class 'survey.design'
svylorenz(
  formula,
  design,
  quantiles = seq(0, 1, 0.1),
  empirical = FALSE,
  plot = TRUE,
  add = FALSE,
  curve.col = "red",
  ci = TRUE,
  alpha = 0.05,
  na.rm = FALSE,
  deff = FALSE,
  linearized = FALSE,
  influence = FALSE,
  ...
)

## S3 method for class 'svyrep.design'
svylorenz(
  formula,
  design,
  quantiles = seq(0, 1, 0.1),
  empirical = FALSE,
  plot = TRUE,
  add = FALSE,
  curve.col = "red",
  ci = TRUE,
  alpha = 0.05,
  na.rm = FALSE,
  deff = FALSE,
  linearized = FALSE,
  return.replicates = FALSE,
  ...
)

## S3 method for class 'DBIsvydesign'
svylorenz(formula, design, ...)

Arguments

`formula`	a formula specifying the income variable
`design`	a design object of class `survey.design` or class `svyrep.design` from the `survey` package.
`...`	additional arguments passed to `plot` methods
`quantiles`	a sequence of probabilities defining the quantiles at which the cumulative sums are calculated
`empirical`	Should an empirical Lorenz curve be estimated as well? Defaults to `FALSE`.
`plot`	Should the Lorenz curve be plotted? Defaults to `TRUE`.
`add`	Should the curve be added to the current graph instead of starting a new plot? Defaults to `FALSE`.
`curve.col`	a string defining the color of the curve.
`ci`	Should the confidence interval be plotted? Defaults to `TRUE`.
`alpha`	a number that specifies the confidence level for the graph.
`na.rm`	Should cases with missing values be dropped? Defaults to `FALSE`.
`deff`	Return the design effect (see `survey::svymean`)
`linearized`	Should a matrix of linearized variables be returned?
`influence`	Should a matrix of (weighted) influence functions be returned? (for compatibility with `svyby`)
`return.replicates`	Return the replicate estimates?

Details

You must run the convey_prep function on your survey design object immediately after creating it with the svydesign or svrepdesign function.

Note that the 'empirical' curve is observation-based and is the one actually used to calculate the Gini index. The quantile-based curve, on the other hand, is used to estimate the shares, standard errors and confidence intervals.

As the number of quantiles increases, the quantile-based curve approaches the observation-based curve.
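
The relationship between the two curves can be illustrated with a minimal base R sketch. It is a simplified illustration on toy data, not the estimator implemented in convey: the data, the weights and the helper `lorenz_at` are hypothetical, and the quantile rule used here (the smallest observed income whose cumulative weight share reaches `p`) is an assumption made for brevity.

```r
# Toy data (hypothetical): five incomes with equal sampling weights.
income <- c(10, 20, 30, 40, 100)
w      <- rep(1, 5)

# Sort observations by income, carrying the weights along.
ord <- order(income)
y   <- income[ord]
wt  <- w[ord]

# Observation-based (empirical) curve: one point per sorted observation.
cum_pop <- cumsum(wt) / sum(wt)           # cumulative population share
cum_inc <- cumsum(wt * y) / sum(wt * y)   # cumulative income share

# Quantile-based curve: income share of units at or below the p-th quantile.
# Hypothetical helper; the quantile is taken as the smallest observed income
# whose cumulative population share reaches p.
lorenz_at <- function(p) {
  q <- y[which(cum_pop >= p)[1]]
  sum(wt * y * (y <= q)) / sum(wt * y)
}

coarse <- sapply(c(0.5, 1), lorenz_at)               # two quantiles
fine   <- sapply(c(0.25, 0.5, 0.75, 1), lorenz_at)   # four quantiles

print(cbind(cum_pop, cum_inc))
print(coarse)
print(fine)
```

With only two quantiles the quantile-based curve touches two of the five empirical points; with four quantiles it touches four of them, which is the sense in which a finer grid brings the two curves together.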

Value

An object of class "oldsvyquantile": a vector with a "quantiles" attribute giving the proportion of income below each quantile, and a "SE" attribute giving the standard errors of the estimates.

Author(s)

Guilherme Jacob, Djalma Pessoa and Anthony Damico

References

Milorad Kovacevic and David Binder (1997). Variance Estimation for Measures of Income Inequality and Polarization - The Estimating Equations Approach. Journal of Official Statistics, Vol. 13, No. 1, pp. 41-58. URL https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/variance-estimation-for-measures-of-income-inequality-and-polarization—the-estimating-equations-approach.pdf.

Shlomo Yitzhaki and Robert Lerman (1989). Improving the accuracy of estimates of Gini coefficients. Journal of Econometrics, Vol.42(1), pp. 43-47, September.

Matti Langel (2012). Measuring inequality in finite population sampling. PhD thesis. URL http://doc.rero.ch/record/29204.

Examples


library(convey)
library(survey)
library(laeken)
data(eusilc) ; names( eusilc ) <- tolower( names( eusilc ) )

# linearized design
des_eusilc <- svydesign( ids = ~rb030 , strata = ~db040 ,  weights = ~rb050 , data = eusilc )
des_eusilc <- convey_prep( des_eusilc )
svylorenz( ~eqincome , des_eusilc, seq(0,1,.05), alpha = .01 )

# replicate-weighted design
des_eusilc_rep <- as.svrepdesign( des_eusilc , type = "bootstrap" )
des_eusilc_rep <- convey_prep( des_eusilc_rep )

svylorenz( ~eqincome , des_eusilc_rep, seq(0,1,.05), alpha = .01 )

## Not run: 

# linearized design using a variable with missings
svylorenz( ~py010n , des_eusilc, seq(0,1,.05), alpha = .01 )
svylorenz( ~py010n , des_eusilc, seq(0,1,.05), alpha = .01, na.rm = TRUE )
# demonstration of `curve.col=` and `add=` parameters
svylorenz( ~eqincome , des_eusilc, seq(0,1,.05), alpha = .05 , add = TRUE , curve.col = 'green' )
# replicate-weighted design using a variable with missings
svylorenz( ~py010n , des_eusilc_rep, seq(0,1,.05), alpha = .01 )
svylorenz( ~py010n , des_eusilc_rep, seq(0,1,.05), alpha = .01, na.rm = TRUE )



# database-backed design
library(RSQLite)
library(DBI)
dbfile <- tempfile()
conn <- dbConnect( RSQLite::SQLite() , dbfile )
dbWriteTable( conn , 'eusilc' , eusilc )

dbd_eusilc <-
	svydesign(
		ids = ~rb030 ,
		strata = ~db040 ,
		weights = ~rb050 ,
		data="eusilc",
		dbname=dbfile,
		dbtype="SQLite"
	)

dbd_eusilc <- convey_prep( dbd_eusilc )

svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.05), alpha = .01 )

# highlighting the difference between the quantile-based curve and the empirical version:
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.5), empirical = TRUE, ci = FALSE, curve.col = "green" )
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.5), alpha = .01, add = TRUE )
legend( "topleft", c("Quantile-based", "Empirical"), lwd = c(1,1), col = c("red", "green"))
# as the number of quantiles increases, the difference between the curves gets smaller
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.01), empirical = TRUE, ci = FALSE, curve.col = "green" )
svylorenz( ~eqincome , dbd_eusilc, seq(0,1,.01), alpha = .01, add = TRUE )
legend( "topleft", c("Quantile-based", "Empirical"), lwd = c(1,1), col = c("red", "green"))

dbRemoveTable( conn , 'eusilc' )

dbDisconnect( conn , shutdown = TRUE )


## End(Not run)

[Package convey version 1.0.0]