prnsamplr-package {prnsamplr}    R Documentation

Permanent Random Number Sampling

Description

Survey sampling using permanent random numbers (PRNs). PRNs address the problem of unknown overlap between survey samples, which leads to low precision in estimates when a survey is repeated or combined with other surveys. The PRN solution is to supply the U(0, 1) random numbers to the sampling procedure instead of having the procedure generate them. Lindblom (2014) <doi:10.2478/jos-2014-0047>, and the articles cited therein, show how this is carried out and how it improves the estimates. This package supports two common fixed-size sampling procedures (simple random sampling and probability-proportional-to-size sampling) and includes a function for transforming the PRNs in order to control the sample overlap.

Details

This package provides two functions for drawing stratified PRN-assisted samples: srs and pps. The former (simple random sampling) assumes that each unit k in a given stratum h is equally likely to be sampled, with inclusion probability

\pi_k = \frac{n_h}{N_h}

where n_h is the sample size and N_h the number of units in stratum h. The function then samples the n_h elements with the smallest PRNs in each stratum h.
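The selection rule above can be sketched in base R without the package. This is a minimal illustration, not the implementation used by srs: the toy frame and its column names (id, stratum, prn, nsample, sampled, pik) are hypothetical.

```r
# Hypothetical toy frame; the column names are illustrative only.
frame <- data.frame(
  id      = 1:8,
  stratum = c("A", "A", "A", "A", "B", "B", "B", "B"),
  prn     = c(0.62, 0.11, 0.87, 0.35, 0.09, 0.54, 0.48, 0.93),
  nsample = c(2, 2, 2, 2, 1, 1, 1, 1)
)

# Rank the PRNs within each stratum (1 = smallest PRN).
prn_rank <- ave(frame$prn, frame$stratum, FUN = rank)

# A unit is sampled when its within-stratum rank does not exceed n_h.
frame$sampled <- prn_rank <= frame$nsample

# Inclusion probability n_h / N_h, with N_h the number of units in the stratum.
stratum_size <- ave(frame$prn, frame$stratum, FUN = length)
frame$pik <- frame$nsample / stratum_size

frame[frame$sampled, c("id", "stratum", "prn", "pik")]
```

Because the PRNs are stored on the frame rather than generated at sampling time, rerunning this selection on the same frame returns the same units.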

The latter (Pareto πps sampling) assumes that large units are more likely to be sampled than small units. The function approximates the unknown inclusion probability of unit k as

\lambda_k = n_h \frac{x_k}{\sum_{i=1}^{N_h} x_i},

where x_k is a size measure and the sum runs over all N_h units in the stratum, and samples the n_h elements with the smallest values of

Q_k = \frac{PRN_k (1 - \lambda_k)}{\lambda_k (1 - PRN_k)},

for each stratum h.
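The Pareto ranking can likewise be sketched in base R. This is an illustration under stated assumptions, not the code behind pps: the frame and its column names are hypothetical, the sum of the size measure is taken over all units in the stratum, and every λ and every PRN is assumed to lie strictly between 0 and 1 so that Q is well defined.

```r
# Hypothetical one-stratum frame; the column names are illustrative only.
frame <- data.frame(
  id      = 1:5,
  stratum = "A",
  prn     = c(0.40, 0.75, 0.10, 0.55, 0.90),
  size    = c(10, 40, 20, 20, 10),
  nsample = 2
)

# lambda_k = n_h * x_k / (sum of the size measure over the stratum).
size_total <- ave(frame$size, frame$stratum, FUN = sum)
frame$lambda <- frame$nsample * frame$size / size_total

# Pareto ranking variable; assumes 0 < lambda < 1 and 0 < prn < 1.
frame$Q <- (frame$prn * (1 - frame$lambda)) /
  (frame$lambda * (1 - frame$prn))

# Sample the n_h units with the smallest Q in each stratum.
q_rank <- ave(frame$Q, frame$stratum, FUN = rank)
frame$sampled <- q_rank <= frame$nsample

frame[order(frame$Q), c("id", "prn", "size", "lambda", "Q", "sampled")]
```

In this toy frame the large unit with size 40 is selected despite a fairly high PRN of 0.75, while the small unit with PRN 0.40 is not, which is the intended effect of weighting by size.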

These two functions can be run standalone or via the wrapper function samp. The input is the sampling frame, with the stratification information and the PRNs given as variables on the frame; pps additionally requires a size measure given as a variable on the frame. The output is a copy of the sampling frame containing sampling information and, in the case of pps, also λ and Q.

The package also provides the function transformprn, which makes it possible to choose where to start counting, and in which direction, when enumerating the PRNs in the sampling routines. This is done by specifying start and direction to transformprn and then calling srs or pps on its output.
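One common way to move the starting point is a modular shift of the PRNs, sketched below. This is not necessarily how transformprn is implemented: the helper shift_prn is hypothetical, and the direction codes "U" (count upwards from start) and "D" (count downwards from start) are assumptions modelled on the direction="U" argument used in the Examples.

```r
# Hypothetical helper, not part of prnsamplr.
# After the shift, "smallest transformed PRN" means "first unit met when
# counting from `start` in the chosen direction, wrapping around at 0 and 1".
shift_prn <- function(prn, start, direction = c("U", "D")) {
  direction <- match.arg(direction)
  if (direction == "U") {
    (prn - start) %% 1   # count upwards from start
  } else {
    (start - prn) %% 1   # count downwards from start
  }
}

prn <- c(0.05, 0.25, 0.60, 0.95)

up   <- shift_prn(prn, start = 0.2, direction = "U")
down <- shift_prn(prn, start = 0.2, direction = "D")

order(up)    # enumeration order when counting upwards from 0.2
order(down)  # enumeration order when counting downwards from 0.2
```

Counting upwards from 0.2, the unit with PRN 0.25 comes first and the unit with PRN 0.05 comes last; counting downwards, the order starts at 0.05 and wraps around to 0.95. Sampling the smallest transformed values from different starting points is what lets two surveys control how much their samples overlap.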

Finally, an example dataset is provided that can be used to illustrate the functionality of the package.

Author(s)

Kira Coder Gylling

Maintainer: Kira Coder Gylling <kira.gylling@gmail.com>

References

Lindblom, A. (2014). "On Precision in Estimates of Change over Time where Samples are Positively Coordinated by Permanent Random Numbers." Journal of Official Statistics, vol. 30, no. 4, pp. 773-785. https://doi.org/10.2478/jos-2014-0047

See Also

srs, pps, samp, transformprn, ExampleData.

Examples

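library(prnsamplr)

# Simple random sampling: stratified PRN-assisted sample from ExampleData
dfSRS <- srs(df=ExampleData,
             nsamp="nsample",
             stratid="stratum",
             prn="rands")

# Pareto pps sampling: as above, with sizeM as the size measure
dfPPS <- pps(df=ExampleData,
             nsamp="nsample",
             stratid="stratum",
             prn="rands",
             size="sizeM")

# Transform the PRNs: start counting at 0.2, in direction "U"
dfPRN <- transformprn(df=ExampleData,
                      prn="rands",
                      direction="U",
                      start=0.2)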

[Package prnsamplr version 0.3.0 Index]