| jip_approx {jipApprox} | R Documentation |
Approximate Joint-Inclusion Probabilities
Description
Approximations of joint-inclusion probabilities by means of first-order inclusion probabilities.
Usage
jip_approx(pik, method)
Arguments
pik |
numeric vector of first-order inclusion probabilities for all population units. |
method |
string representing one of the available approximation methods. |
Details
Available methods are "Hajek", "HartleyRao", "Tille",
"Brewer1", "Brewer2", "Brewer3", and "Brewer4".
Note that these methods were derived for high-entropy sampling designs,
so they may perform poorly under other designs.
Hájek (1964) approximation [method="Hajek"] is derived under the Maximum Entropy sampling design
and is given by
\tilde{\pi}_{ij} = \pi_i\pi_j \biggl( 1 - \frac{(1-\pi_i)(1-\pi_j)}{d} \biggr)
where d = \sum_{k\in U} \pi_k(1-\pi_k).
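As an illustration, the Hájek approximation can be sketched in base R, assuming it takes the form \pi_i\pi_j [1 - (1-\pi_i)(1-\pi_j)/d]. The function name hajek_jip is hypothetical and is not part of the package; the diagonal is set to the first-order probabilities to match the Value section.

```r
# Hypothetical helper (not part of jipApprox): Hajek-type approximation
# pi_ij ~ pi_i * pi_j * (1 - (1 - pi_i) * (1 - pi_j) / d)
hajek_jip <- function(pik) {
  d <- sum(pik * (1 - pik))                       # d = sum_k pi_k (1 - pi_k)
  out <- outer(pik, pik) * (1 - outer(1 - pik, 1 - pik) / d)
  diag(out) <- pik                                # diagonal holds first-order probabilities
  out
}

pik <- c(0.2, 0.4, 0.6, 0.8)                      # toy probabilities, sum to n = 2
pikl <- hajek_jip(pik)
isSymmetric(pikl)
```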
Hartley and Rao (1962) proposed the following approximation under
randomised systematic sampling [method="HartleyRao"]:
\tilde{\pi}_{ij} = \frac{n-1}{n} \pi_i\pi_j + \frac{n-1}{n^2} (\pi_i^2 \pi_j + \pi_i \pi_j^2)
- \frac{n-1}{n^3}\pi_i\pi_j \sum_{k\in U} \pi_k^2
+ \frac{2(n-1)}{n^3} (\pi_i^3 \pi_j + \pi_i\pi_j^3 + \pi_i^2 \pi_j^2)
- \frac{3(n-1)}{n^4} (\pi_i^2 \pi_j + \pi_i\pi_j^2) \sum_{k \in U}\pi_k^2
+ \frac{3(n-1)}{n^5} \pi_i\pi_j \biggl( \sum_{k\in U} \pi_k^2 \biggr)^2
- \frac{2(n-1)}{n^4} \pi_i\pi_j \sum_{k \in U} \pi_k^3
where n is the sample size.
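The seven terms of the Hartley-Rao expansion map directly onto outer products. The following base R sketch assumes that n equals the sum of the first-order probabilities and that the sums run over all population units; hartley_rao_jip is a hypothetical name, not a package function.

```r
# Hypothetical helper (not part of jipApprox): Hartley-Rao expansion,
# assuming n = sum(pik) and sums taken over the whole population.
hartley_rao_jip <- function(pik) {
  n  <- sum(pik)
  s2 <- sum(pik^2)                                # sum_k pi_k^2
  s3 <- sum(pik^3)                                # sum_k pi_k^3
  pp <- outer(pik, pik)                           # pi_i pi_j
  q1 <- outer(pik^2, pik) + outer(pik, pik^2)     # pi_i^2 pi_j + pi_i pi_j^2
  q2 <- outer(pik^3, pik) + outer(pik, pik^3) + outer(pik^2, pik^2)
  out <- (n - 1) / n * pp +
    (n - 1) / n^2 * q1 -
    (n - 1) / n^3 * pp * s2 +
    2 * (n - 1) / n^3 * q2 -
    3 * (n - 1) / n^4 * q1 * s2 +
    3 * (n - 1) / n^5 * pp * s2^2 -
    2 * (n - 1) / n^4 * pp * s3
  diag(out) <- pik                                # diagonal holds first-order probabilities
  out
}

pik <- rep(0.5, 4)                                # equal probabilities, N = 4, n = 2
hr <- hartley_rao_jip(pik)
```

With equal probabilities the exact joint-inclusion probability under simple random sampling is n(n-1)/(N(N-1)), which gives a simple reference value for checking the approximation.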
Tillé (1996) proposed the approximation \tilde{\pi}_{ij} = \beta_i\beta_j,
where the coefficients \beta_i are computed iteratively through the
following procedure [method="Tille"], for k=1,2,3, \dots:
- \beta_i^{(0)} = \pi_i, \,\, \forall i\in U
- \beta_i^{(2k-1)} = \frac{(n-1)\pi_i}{\beta^{(2k-2)} - \beta_i^{(2k-2)}}
- \beta_i^{(2k)} = \beta_i^{(2k-1)} \Biggl( \frac{n(n-1)}{(\beta^{(2k-1)})^2 - \sum_{\ell\in U} (\beta_\ell^{(2k-1)})^2 } \Biggr)^{1/2}
with \beta^{(m)} = \sum_{\ell\in U} \beta_\ell^{(m)} denoting the sum of the coefficients at step m.
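The iteration can be sketched in base R as follows. The stopping rule (tolerance eps, iteration cap maxit) and the function name tille_jip are assumptions made for this illustration; they are not taken from the package or from Tillé (1996).

```r
# Hypothetical helper (not part of jipApprox): iterative beta coefficients.
# eps and maxit are assumed stopping parameters, chosen for illustration.
tille_jip <- function(pik, eps = 1e-8, maxit = 1000) {
  n <- sum(pik)
  b <- pik                                        # beta^(0) = pi
  for (k in seq_len(maxit)) {
    # odd step: (n - 1) * pi_i / (beta - beta_i)
    b_odd <- (n - 1) * pik / (sum(b) - b)
    # even step: rescale so that sum_{i != j} beta_i beta_j = n (n - 1)
    b_even <- b_odd * sqrt(n * (n - 1) / (sum(b_odd)^2 - sum(b_odd^2)))
    converged <- max(abs(b_even - b)) < eps
    b <- b_even
    if (converged) break
  }
  out <- outer(b, b)                              # pi_ij ~ beta_i * beta_j
  diag(out) <- pik                                # diagonal holds first-order probabilities
  out
}

pik <- c(0.2, 0.4, 0.6, 0.8)                      # toy probabilities, sum to n = 2
tl <- tille_jip(pik)
sum(tl) - sum(diag(tl))                           # off-diagonal total, compare with n * (n - 1)
```

The even step is a normalisation: after it, the off-diagonal entries of the outer product sum to n(n-1), as required of joint-inclusion probabilities under a fixed-size design.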
Finally, Brewer (2002) and Brewer and Donadio (2003) proposed four approximations, which are defined by the general form
\tilde{\pi}_{ij} = \pi_i\pi_j (c_i + c_j)/2
where the c_i determine the approximation used:
- Equation (9) [method="Brewer1"]: c_i = (n-1) / (n-\pi_i)
- Equation (10) [method="Brewer2"]: c_i = (n-1) / \Bigl(n - n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
- Equation (11) [method="Brewer3"]: c_i = (n-1) / \Bigl(n - 2\pi_i + n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
- Equation (18) [method="Brewer4"]: c_i = (n-1) / \Bigl(n - (2n-1)(n-1)^{-1}\pi_i + (n-1)^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
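Because the four Brewer approximations share one general form and differ only in c_i, a single base R sketch can cover them. The function brewer_jip and its integer type argument are hypothetical and only loosely mirror the package's method strings; n is assumed to equal the sum of the first-order probabilities.

```r
# Hypothetical helper (not part of jipApprox): pi_ij ~ pi_i pi_j (c_i + c_j) / 2.
# type = 1..4 selects the c_i listed for "Brewer1".."Brewer4".
brewer_jip <- function(pik, type = 1) {
  stopifnot(type %in% 1:4)
  n  <- sum(pik)
  s2 <- sum(pik^2)                                # sum_k pi_k^2
  ci <- switch(type,
    (n - 1) / (n - pik),
    rep((n - 1) / (n - s2 / n), length(pik)),
    (n - 1) / (n - 2 * pik + s2 / n),
    (n - 1) / (n - (2 * n - 1) / (n - 1) * pik + s2 / (n - 1))
  )
  out <- outer(pik, pik) * outer(ci, ci, "+") / 2
  diag(out) <- pik                                # diagonal holds first-order probabilities
  out
}

pik <- rep(0.5, 4)                                # equal probabilities, N = 4, n = 2
br <- lapply(1:4, function(type) brewer_jip(pik, type))
```

With equal probabilities, all four choices of c_i can be compared against the exact simple random sampling value n(n-1)/(N(N-1)).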
Value
A symmetric matrix of joint-inclusion probabilities, whose diagonal is the vector of first-order inclusion probabilities.
References
Hartley, H.O.; Rao, J.N.K., 1962. Sampling With Unequal Probability and Without Replacement. The Annals of Mathematical Statistics 33 (2), 350-374.
Hájek, J., 1964. Asymptotic Theory of Rejective Sampling with Varying Probabilities from a Finite Population. The Annals of Mathematical Statistics 35 (4), 1491-1523.
Tillé, Y., 1996. Some Remarks on Unequal Probability Sampling Designs Without Replacement. Annals of Economics and Statistics 44, 177-189.
Brewer, K.R.W.; Donadio, M.E., 2003. The High Entropy Variance of the Horvitz-Thompson Estimator. Survey Methodology 29 (2), 189-196.
Examples
### Generate population data ---
N <- 20; n<-5
set.seed(0)
x <- rgamma(N, scale=10, shape=5)
y <- abs( 2*x + 3.7*sqrt(x) * rnorm(N) )
pik <- n * x/sum(x)
### Approximate joint-inclusion probabilities ---
pikl <- jip_approx(pik, method='Hajek')
pikl <- jip_approx(pik, method='HartleyRao')
pikl <- jip_approx(pik, method='Tille')
pikl <- jip_approx(pik, method='Brewer1')
pikl <- jip_approx(pik, method='Brewer2')
pikl <- jip_approx(pik, method='Brewer3')
pikl <- jip_approx(pik, method='Brewer4')