R: Approximate inclusion probabilities by Monte Carlo simulation

jip_MonteCarlo {jipApprox}

R Documentation

Approximate inclusion probabilities by Monte Carlo simulation

Description

Approximate first- and second-order inclusion probabilities by means of Monte Carlo simulation. Each estimate is the proportion of replications in which a unit (first order) or a pair of units (second order) is selected, out of the total number of replications. One is added to both the numerator and the denominator to ensure that all estimates are strictly positive (Fattorini, 2006).
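
The estimator described above amounts to (count + 1) / (replications + 1). The following is a minimal base-R sketch of that idea, not the package's implementation. It assumes simple random sampling without replacement as a stand-in design, and the helper name `mc_jip` is hypothetical.

```r
# Hypothetical helper, not part of jipApprox: Monte Carlo estimates of
# first- and second-order inclusion probabilities under simple random
# sampling without replacement (a stand-in design chosen for simplicity).
mc_jip <- function(N, n, replications = 1000, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  counts <- matrix(0, N, N)
  for (r in seq_len(replications)) {
    s <- numeric(N)
    s[sample.int(N, n)] <- 1          # 0/1 sample membership indicator
    counts <- counts + tcrossprod(s)  # diagonal: units, off-diagonal: pairs
  }
  # One is added to numerator and denominator so no estimate is exactly zero
  (counts + 1) / (replications + 1)
}

est <- mc_jip(N = 6, n = 2, replications = 2000, seed = 1)
round(est, 3)
```

Under this stand-in design the diagonal entries should settle near n/N and the off-diagonal entries near n(n-1)/(N(N-1)), which gives a quick sanity check on the counting logic.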

Usage

jip_MonteCarlo(
  x,
  n,
  replications = 1e+06,
  design,
  units,
  seed = NULL,
  as_data_frame = FALSE,
  design_pars,
  write_on_file = FALSE,
  filename,
  path,
  by = NULL,
  progress_bar = TRUE
)

Arguments

`x`	size measure or first-order inclusion probabilities, a vector or single-column data.frame
`n`	sample size (for fixed-size designs), or expected sample size (for Poisson sampling)
`replications`	numeric value, number of independent Monte Carlo replications
`design`	sampling procedure to be used for sample selection. Either a string indicating the name of the sampling design or a function; see section "Details" for more information.
`units`	indices of the units for which probabilities have to be estimated. Optional; if missing, estimates are produced for the whole population
`seed`	a valid seed value for reproducibility
`as_data_frame`	logical, should the output be returned as a data.frame? If FALSE, a matrix is returned
`design_pars`	only used when a function is passed to argument `design`, named list of parameters to pass to the sampling design function.
`write_on_file`	logical, should output be written on a text file?
`filename`	string indicating the name of the file to create on disk, must include the `.txt` extension; only applies if `write_on_file = TRUE`.
`path`	string indicating the path to the directory where the output file should be created; only applies if `write_on_file = TRUE`.
`by`	optional; integer scalar giving the number of replications between successive partial outputs
`progress_bar`	logical, indicating whether a progress bar is desired

Details

Argument design accepts either a string naming the sampling design used to draw samples, or a function. Accepted designs are "brewer", "tille", "maxEntropy", "poisson", "sampford", "systematic", "randomSystematic". When a function is passed, it should take as input the parameters supplied through argument design_pars and return either a logical vector or a vector of 0s and 1s, where TRUE or 1 indicates a sampled unit and FALSE or 0 indicates a non-sampled unit. The length of this vector must equal the length of x if units is not specified; otherwise it must have the same length as units.
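
As an illustration of the contract described above, here is a minimal sketch of a user-defined design function. The name `srs_design` and its parameters `N` and `n` are hypothetical; the commented call at the end is an assumed usage inferred from the argument descriptions and is not run here.

```r
# Hypothetical user-defined design: simple random sampling without
# replacement. It returns a logical vector with TRUE for sampled units.
srs_design <- function(N, n) {
  s <- logical(N)
  s[sample.int(N, n)] <- TRUE
  s
}

set.seed(42)
s <- srs_design(N = 20, n = 5)
length(s)  # same length as the population
sum(s)     # number of sampled units

# Assumed usage, based on the argument descriptions (not run here):
# pikl <- jip_MonteCarlo(x = pik, n = 5, replications = 100,
#                        design = srs_design,
#                        design_pars = list(N = 20, n = 5))
```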

When write_on_file = TRUE, specifying a value for argument by produces intermediate files containing the approximate inclusion probabilities after every by replications. E.g., if replications=1e06 and by=5e05, two output files will be created: one with estimates at 5e05 replications and one at 1e06 replications. This option is particularly useful to assess convergence of the estimates.
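
To make the convergence check concrete, the sketch below keeps intermediate estimates in memory and compares two checkpoints. It is a hypothetical base-R illustration under simple random sampling, not the package's file-writing mechanism; the helper `mc_checkpoints` and the choice of the maximum absolute difference as a diagnostic are assumptions.

```r
# Hypothetical sketch, not the package's code: keep running counts and take
# a snapshot of the estimates every `by` replications.
mc_checkpoints <- function(N, n, replications, by, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  counts <- matrix(0, N, N)
  snapshots <- list()
  for (r in seq_len(replications)) {
    s <- numeric(N)
    s[sample.int(N, n)] <- 1          # 0/1 sample membership indicator
    counts <- counts + tcrossprod(s)
    if (r %% by == 0) {
      # Estimate based on the first r replications only
      snapshots[[as.character(r)]] <- (counts + 1) / (r + 1)
    }
  }
  snapshots
}

snap <- mc_checkpoints(N = 6, n = 2, replications = 1000, by = 500, seed = 1)
names(snap)
# Largest change between checkpoints; small values suggest stabilisation
max(abs(snap[["1000"]] - snap[["500"]]))
```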

Value

A matrix of estimated inclusion probabilities if as_data_frame=FALSE, otherwise a data.frame with three columns: the first two contain the ids of the two units in each pair, and the third contains the joint-inclusion probability values. Note that when as_data_frame=TRUE, first-order inclusion probabilities are not returned.
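
The relationship between the two output forms can be sketched as follows. This assumes the matrix holds first-order probabilities on the diagonal and joint probabilities off the diagonal; the toy values and the column names `i`, `j` and `pi_ij` are made up for illustration and may differ from what the function actually returns.

```r
# Toy symmetric matrix standing in for the matrix output (values made up
# for illustration): diagonal = first-order, off-diagonal = joint.
pikl <- matrix(c(0.50, 0.20, 0.30,
                 0.20, 0.60, 0.40,
                 0.30, 0.40, 0.90), nrow = 3, byrow = TRUE)

# One row per unordered pair of units; the diagonal is dropped, so
# first-order probabilities do not appear in the long format.
idx <- which(upper.tri(pikl), arr.ind = TRUE)
pairs_df <- data.frame(i = idx[, "row"], j = idx[, "col"], pi_ij = pikl[idx])
pairs_df
```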

References

Fattorini, L. 2006. Applying the Horvitz-Thompson criterion in complex designs: A computer-intensive perspective for estimating inclusion probabilities. Biometrika 93 (2), 269–278.

Examples

### Generate population data ---
N <- 20; n <- 5

set.seed(0)
x <- rgamma(N, scale=10, shape=5)
y <- abs( 2*x + 3.7*sqrt(x) * rnorm(N) )

pik  <- n * x/sum(x)

### Approximate joint-inclusion probabilities
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "brewer")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "tille")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "maxEntropy")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "randomSystematic")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "systematic")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "sampford")
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "poisson")

# Use an external function to draw samples
pikl <- jip_MonteCarlo(x=pik, n=n, replications=100,
                       design = sampling::UPmidzuno, design_pars = list(pik=pik))

# Write output to file after 50 and 100 replications
pikl <- jip_MonteCarlo(x=pik, n = n, replications = 100, design = "brewer",
                       write_on_file = TRUE, filename="test.txt", path=tempdir(), by = 50 )

[Package jipApprox version 0.1.5 Index]