prnsamplr-package {prnsamplr} | R Documentation
Permanent Random Number Sampling
Description
Survey sampling using permanent random numbers (PRNs). PRNs address the problem of unknown overlap between survey samples, which leads to low precision in estimates when a survey is repeated or combined with other surveys. The PRN solution is to supply the U(0, 1) random numbers to the sampling procedure, instead of having the sampling procedure generate them. Lindblom (2014) <doi:10.2478/jos-2014-0047>, and the articles cited therein, show how this is carried out and how it improves the estimates. This package supports two common fixed-size sampling procedures (simple random sampling and probability-proportional-to-size sampling) and includes a function for transforming the PRNs in order to control the sample overlap.
Details
This package provides two functions for drawing stratified PRN-assisted samples: srs and pps. The former (simple random sampling) assumes that each unit k in a given stratum h is equally likely to be sampled, with inclusion probability
\pi_k = \frac{n_h}{N_h}
for each stratum h, where N_h is the number of units in the stratum and n_h is its sample size. The function then samples the n_h elements with the smallest PRNs in each stratum h.
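The selection rule above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy frame, its column names and the hand-picked PRNs are all hypothetical.

```r
# Hypothetical toy frame: two strata, PRNs fixed by hand for reproducibility.
frame <- data.frame(
  id      = 1:8,
  stratum = c("A", "A", "A", "A", "B", "B", "B", "B"),
  nsample = c(2, 2, 2, 2, 1, 1, 1, 1),  # n_h, repeated on every row of stratum h
  prn     = c(0.62, 0.15, 0.48, 0.91, 0.33, 0.07, 0.80, 0.55)
)

# Rank the PRNs within each stratum (rank 1 = smallest PRN).
frame$rank <- ave(frame$prn, frame$stratum, FUN = rank)

# A unit is sampled when its within-stratum rank is at most n_h.
frame$sampled <- frame$rank <= frame$nsample

# Inclusion probability pi_k = n_h / N_h, with N_h the stratum size.
frame$N   <- ave(frame$prn, frame$stratum, FUN = length)
frame$pik <- frame$nsample / frame$N

sampled_ids <- frame$id[frame$sampled]
```

Because the PRNs are stored on the frame rather than generated inside the procedure, rerunning the selection on the same frame returns the same units.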
The latter (Pareto \pi ps sampling) assumes that large units are more likely to be sampled than small units. The function approximates the unknown inclusion probability of unit k as
\lambda_k = n_h \frac{x_k}{\sum_{i=1}^{N_h} x_i},
where x_k is a size measure, and samples the n_h elements with the smallest values of
Q_k = \frac{PRN_k(1 - \lambda_k)}{\lambda_k(1 - PRN_k)}
in each stratum h.
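As a minimal sketch of the two formulas above for a single stratum, again in base R and not taken from the package source: the frame, its column names and the PRN values are hypothetical, and the sum in \lambda_k is assumed to run over every unit in the stratum.

```r
# Hypothetical single-stratum frame with a size measure and fixed PRNs.
frame <- data.frame(
  id   = 1:5,
  size = c(10, 20, 30, 15, 25),
  prn  = c(0.40, 0.70, 0.90, 0.10, 0.50)
)
n <- 2  # desired sample size n_h

# Target inclusion probabilities: lambda_k = n_h * x_k / (sum of x in the stratum).
frame$lambda <- n * frame$size / sum(frame$size)

# Q_k is only defined for 0 < lambda_k < 1; units with lambda_k >= 1 would need
# separate handling, which this sketch deliberately leaves out.
stopifnot(all(frame$lambda > 0 & frame$lambda < 1))

# Pareto ranking variable Q_k.
frame$Q <- (frame$prn * (1 - frame$lambda)) / (frame$lambda * (1 - frame$prn))

# Select the n units with the smallest Q.
frame$sampled <- rank(frame$Q) <= n
sampled_ids <- frame$id[frame$sampled]
```

Note that the \lambda_k sum to n_h by construction, so they can be read as target inclusion probabilities for a sample of fixed size n_h.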
These two functions can be run standalone or via the wrapper function samp. Input to the functions is the sampling frame, with stratification information and PRNs given as variables on the frame and, in the case of pps, a size measure also given as a variable on the frame. Output is a copy of the sampling frame containing sampling information and, in the case of pps, also \lambda and Q.
A function transformprn is also provided, which makes it possible to select where to start counting, and in which direction, when enumerating the PRNs in the sampling routines. This is done by specifying start and direction to transformprn and then calling srs or pps on its output.
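One common way to realise a start point and a direction is to treat the interval [0, 1) as a circle and replace each PRN by its distance from the start point. The sketch below assumes that approach; shift_prn is a hypothetical helper written for illustration, and the transformation actually used by transformprn may differ.

```r
# Hypothetical helper, not the package's implementation: re-express each PRN
# as its distance from `start`, measured upward or downward with wrap-around.
shift_prn <- function(prn, start, direction = c("up", "down")) {
  direction <- match.arg(direction)
  if (direction == "up") {
    (prn - start) %% 1  # distance travelled upward from start, wrapping at 1
  } else {
    (start - prn) %% 1  # distance travelled downward from start, wrapping at 0
  }
}

prn  <- c(0.05, 0.25, 0.60, 0.95)
up   <- shift_prn(prn, start = 0.2, direction = "up")
down <- shift_prn(prn, start = 0.2, direction = "down")

# Selecting the smallest transformed values now picks the units met first
# when counting from 0.2 in the chosen direction.
order(up)
order(down)
```

Under this assumption, two surveys that use the same PRNs with different start points or opposite directions tend to select different units, which is how the overlap between their samples can be controlled.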
Finally, an example dataset is provided that can be used to illustrate the functionality of the package.
Author(s)
Kira Coder Gylling
Maintainer: Kira Coder Gylling <kira.gylling@gmail.com>
References
Lindblom, A. (2014). "On Precision in Estimates of Change over Time where Samples are Positively Coordinated by Permanent Random Numbers." Journal of Official Statistics, vol. 30, no. 4, pp. 773-785. https://doi.org/10.2478/jos-2014-0047.
See Also
srs, pps, samp, transformprn, ExampleData.
Examples
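# Load the package, which provides the functions and ExampleData
library(prnsamplr)

# Stratified simple random sampling
dfSRS <- srs(df = ExampleData,
             nsamp = "nsample",
             stratid = "stratum",
             prn = "rands")

# Stratified Pareto pps sampling, with sizeM as the size measure
dfPPS <- pps(df = ExampleData,
             nsamp = "nsample",
             stratid = "stratum",
             prn = "rands",
             size = "sizeM")

# Transform the PRNs using start 0.2 and direction "U"
dfPRN <- transformprn(df = ExampleData,
                      prn = "rands",
                      direction = "U",
                      start = 0.2)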