simcounty {binsmooth}

R Documentation

Simulate data to mimic `county_bins` and `county_true`

Description

Samples from a selection of distributions (Gamma, Lognormal, Weibull, Triangle) to simulate income data in the same format as the American Community Survey data sets `county_bins` and `county_true`.

Usage

simcounty(numCounties, minPop = 1000, maxPop = 100000,
          bin_minimums = c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
                           50000, 60000, 75000, 100000, 125000, 150000, 200000))

Arguments

`numCounties`	Number of counties to simulate.
`minPop`	Minimum county population to sample (default = 1000).
`maxPop`	Maximum county population to sample (default = 100000).
`bin_minimums`	Lower edges of the income bins. Defaults to the edges used in the Census data.

Details

Each simulated county's name indicates which distributions were sampled to generate it.
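
As a rough illustration of the idea, the following base R sketch draws raw incomes from a single Gamma distribution and tabulates them into bins defined by lower edges. It is not the code used by `simcounty`: the population size, the Gamma parameters, and the convention that each `bin_max` is one less than the next bin's minimum (with `NA` for the open top bin) are all assumptions made for this example.

```r
# Illustrative sketch only: NOT the binsmooth implementation.
set.seed(1)  # reproducible draws

# Same lower bin edges as the simcounty() default.
bin_minimums <- c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
                  50000, 60000, 75000, 100000, 125000, 150000, 200000)

# Hypothetical population size and Gamma parameters (assumptions).
pop <- 5000
incomes <- rgamma(pop, shape = 2, scale = 30000)

# Assign each income to the bin whose lower edge it meets or exceeds,
# then count the households falling in each bin.
bin_index <- findInterval(incomes, bin_minimums)
households <- tabulate(bin_index, nbins = length(bin_minimums))

# Assumed upper edges: one below the next bin's minimum; top bin unbounded.
bin_max <- c(bin_minimums[-1] - 1, NA)

sim_bins <- data.frame(bin_min = bin_minimums, bin_max = bin_max,
                       households = households)
sim_bins
```

Every simulated household lands in exactly one bin, so the `households` column sums to `pop`.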

Value

Returns a list of two data frames:

`county_bins`	Simulated binned income data
`county_true`	Statistics computed from the raw (unbinned) simulated data
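
To make "statistics computed from the raw data" concrete, here is a hedged base R sketch that computes a mean, median, Gini coefficient, and Theil index directly from a raw sample. The helper names `gini_raw` and `theil_raw`, the Lognormal parameters, and the choice of these four statistics are assumptions for illustration; only `mean_true` is confirmed as a column of `county_true` by the examples on this page, and the formulas the package uses may differ.

```r
# Illustrative sketch only; the formulas binsmooth uses may differ.
set.seed(2)

# Hypothetical raw incomes (Lognormal parameters are assumptions).
incomes <- rlnorm(2000, meanlog = 10.5, sdlog = 0.8)

# Sample Gini coefficient computed from the sorted values.
gini_raw <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Theil T index: mean of (x / mu) * log(x / mu); requires x > 0.
theil_raw <- function(x) {
  r <- x / mean(x)
  mean(r * log(r))
}

# Collect the raw-data statistics in a one-row data frame.
raw_stats <- data.frame(mean   = mean(incomes),
                        median = median(incomes),
                        gini   = gini_raw(incomes),
                        theil  = theil_raw(incomes))
raw_stats
```

Both inequality measures are zero for a perfectly equal sample, which gives a quick sanity check on the helpers.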

Author(s)

David J. Hunter and McKalie Drown

References

Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/

Examples

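## Simulate five counties and fit two of them from the binned data
l1 <- simcounty(5)
cb <- l1$county_bins   # binned household counts
ct <- l1$county_true   # statistics from the raw simulated incomes
# Spline fit for the county with fips 103, passing its true mean
sbl <- splinebins(cb$bin_max[cb$fips==103], cb$households[cb$fips==103],
                  ct$mean_true[ct$fips==103])
# Step-function fit for the county with fips 105, passing its true mean
stl <- stepbins(cb$bin_max[cb$fips==105], cb$households[cb$fips==105],
                ct$mean_true[ct$fips==105])
# Plot the two estimated densities
plot(sbl$splinePDF, 0, 300000, n=500)
plot(stl$stepPDF, do.points=FALSE, main=cb$county[cb$fips==105][1])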

## Simulate one county and estimate the Gini coefficient and Theil index from binned data
l2 <- simcounty(1)
binedges <- l2$county_bins$bin_max + 0.5 # continuity correction
bincounts <- l2$county_bins$households
splinefit <- splinebins(binedges, bincounts, l2$county_true$mean_true)
gini(splinefit)
theil(splinefit)
l2$county_true

[Package binsmooth version 0.2.2 Index]
