jip_approx {jipApprox} | R Documentation |
Approximate Joint-Inclusion Probabilities
Description
Approximations of joint-inclusion probabilities by means of first-order inclusion probabilities.
Usage
jip_approx(pik, method)
Arguments
pik |
numeric vector of first-order inclusion probabilities for all population units. |
method |
string representing one of the available approximation methods. |
Details
Available methods are "Hajek", "HartleyRao", "Tille", "Brewer1", "Brewer2", "Brewer3", and "Brewer4".
Note that these methods were derived for high-entropy sampling designs,
so they may perform poorly under other designs.
The Hájek (1964) approximation [method="Hajek"] is derived under the Maximum Entropy sampling design
and is given by
\tilde{\pi}_{ij} = \pi_i\pi_j \biggl( 1 - \frac{(1-\pi_i)(1-\pi_j)}{d} \biggr)
where d = \sum_{k\in U} \pi_k(1-\pi_k)
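As an illustration, the Hájek approximation can be sketched in a few lines of base R, using the form \pi_i\pi_j [1 - (1-\pi_i)(1-\pi_j)/d]. The function name hajek_jip and the toy vector pik are hypothetical and chosen only for this example; this is not necessarily how jip_approx computes the result internally.

```r
# Illustrative sketch only; hajek_jip is a hypothetical helper, not part of jipApprox.
hajek_jip <- function(pik) {
  d <- sum(pik * (1 - pik))
  # outer() builds the N x N matrices of pairwise products
  jip <- outer(pik, pik) * (1 - outer(1 - pik, 1 - pik) / d)
  diag(jip) <- pik  # first-order probabilities on the diagonal
  jip
}

pik <- c(0.2, 0.3, 0.5, 0.4, 0.6)  # toy probabilities, summing to n = 2
jip <- hajek_jip(pik)
```

Placing pik on the diagonal mirrors the return value described for jip_approx; the off-diagonal entries are the approximated joint-inclusion probabilities.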
Hartley and Rao (1962) proposed the following approximation under
randomised systematic sampling [method="HartleyRao"]:
\tilde{\pi}_{ij} = \frac{n-1}{n} \pi_i\pi_j + \frac{n-1}{n^2} (\pi_i^2 \pi_j + \pi_i \pi_j^2)
- \frac{n-1}{n^3}\pi_i\pi_j \sum_{k\in U} \pi_k^2
+ \frac{2(n-1)}{n^3} (\pi_i^3 \pi_j + \pi_i\pi_j^3 + \pi_i^2 \pi_j^2)
- \frac{3(n-1)}{n^4} (\pi_i^2 \pi_j + \pi_i\pi_j^2) \sum_{k \in U}\pi_k^2
+ \frac{3(n-1)}{n^5} \pi_i\pi_j \biggl( \sum_{k\in U} \pi_k^2 \biggr)^2
- \frac{2(n-1)}{n^4} \pi_i\pi_j \sum_{k \in U} \pi_k^3
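The seven terms of the Hartley and Rao expansion translate almost line by line into base R. The sketch below is only an illustration: the name hartley_rao_jip is hypothetical, and the sample size n is assumed to equal the sum of the first-order inclusion probabilities (a fixed-size design).

```r
# Illustrative sketch only; hartley_rao_jip is a hypothetical helper.
hartley_rao_jip <- function(pik) {
  n  <- sum(pik)    # assumption: fixed sample size n = sum of pik
  s2 <- sum(pik^2)  # sum over U of pi_k^2
  s3 <- sum(pik^3)  # sum over U of pi_k^3
  p1 <- outer(pik, pik)                        # pi_i pi_j
  p2 <- outer(pik^2, pik) + outer(pik, pik^2)  # pi_i^2 pi_j + pi_i pi_j^2
  p3 <- outer(pik^3, pik) + outer(pik, pik^3) + outer(pik^2, pik^2)
  jip <- (n - 1) / n * p1 +
    (n - 1) / n^2 * p2 -
    (n - 1) / n^3 * p1 * s2 +
    2 * (n - 1) / n^3 * p3 -
    3 * (n - 1) / n^4 * p2 * s2 +
    3 * (n - 1) / n^5 * p1 * s2^2 -
    2 * (n - 1) / n^4 * p1 * s3
  diag(jip) <- pik  # first-order probabilities on the diagonal
  jip
}

pik <- c(0.2, 0.3, 0.5, 0.4, 0.6)  # toy probabilities, summing to n = 2
jip <- hartley_rao_jip(pik)
```

Precomputing the two population sums keeps each term a simple element-wise operation on N x N matrices.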
Tillé (1996) proposed the approximation \tilde{\pi}_{ij} = \beta_i\beta_j,
where the coefficients \beta_i are computed iteratively through the
following procedure [method="Tille"]:
- \beta_i^{(0)} = \pi_i, \,\, \forall i\in U
- \beta_i^{(2k-1)} = \frac{(n-1)\pi_i}{\beta^{(2k-2)} - \beta_i^{(2k-2)}}
- \beta_i^{(2k)} = \beta_i^{(2k-1)} \Biggl( \frac{n(n-1)}{(\beta^{(2k-1)})^2 - \sum_{l\in U} (\beta_l^{(2k-1)})^2 } \Biggr)^{1/2}
where \beta^{(m)} = \sum_{l\in U} \beta_l^{(m)} is the sum of the coefficients at step m, and k=1,2,3, \dots
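The alternating updates of the Tillé procedure can be sketched as a simple loop in base R. The text above does not state a stopping rule, so the tolerance eps and the cap max_iter below are assumptions made for this illustration, as are the function name tille_jip and the choice n = sum(pik).

```r
# Illustrative sketch only; tille_jip, eps and max_iter are hypothetical.
tille_jip <- function(pik, eps = 1e-10, max_iter = 1000) {
  n <- sum(pik)  # assumption: fixed sample size n = sum of pik
  beta <- pik    # step 0: start from the first-order probabilities
  for (k in seq_len(max_iter)) {
    beta_old <- beta
    # odd step: beta_i = (n - 1) pi_i / (sum(beta) - beta_i)
    beta <- (n - 1) * pik / (sum(beta) - beta)
    # even step: rescale so the off-diagonal products sum to n (n - 1)
    beta <- beta * sqrt(n * (n - 1) / (sum(beta)^2 - sum(beta^2)))
    if (max(abs(beta - beta_old)) < eps) break  # assumed stopping rule
  }
  jip <- outer(beta, beta)
  diag(jip) <- pik  # first-order probabilities on the diagonal
  jip
}

pik <- c(0.2, 0.3, 0.5, 0.4, 0.6)  # toy probabilities, summing to n = 2
jip <- tille_jip(pik)
```

At a fixed point of the two updates, each row of off-diagonal entries sums to (n-1)\pi_i, which is the marginal constraint a fixed-size design must satisfy.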
Finally, Brewer (2002) and Brewer and Donadio (2003) proposed four approximations, which are defined by the general form
\tilde{\pi}_{ij} = \pi_i\pi_j (c_i + c_j)/2
where the c_i determine the approximation used:
Equation (9) [method="Brewer1"]: c_i = (n-1) / (n-\pi_i)
Equation (10) [method="Brewer2"]: c_i = (n-1) / \Bigl(n- n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
Equation (11) [method="Brewer3"]: c_i = (n-1) / \Bigl(n - 2\pi_i + n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
Equation (18) [method="Brewer4"]: c_i = (n-1) / \Bigl(n - (2n-1)(n-1)^{-1}\pi_i + (n-1)^{-1}\sum_{k\in U}\pi_k^2 \Bigr)
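Because the four Brewer variants differ only in the coefficients c_i, they can share a single sketch. The function brewer_jip and its version argument are hypothetical names used for illustration, and n is again assumed to equal sum(pik).

```r
# Illustrative sketch only; brewer_jip and 'version' are hypothetical.
brewer_jip <- function(pik, version = 1) {
  n  <- sum(pik)    # assumption: fixed sample size n = sum of pik
  s2 <- sum(pik^2)  # sum over U of pi_k^2
  ci <- switch(as.character(version),
    "1" = (n - 1) / (n - pik),
    "2" = rep((n - 1) / (n - s2 / n), length(pik)),
    "3" = (n - 1) / (n - 2 * pik + s2 / n),
    "4" = (n - 1) / (n - (2 * n - 1) / (n - 1) * pik + s2 / (n - 1)),
    stop("version must be 1, 2, 3 or 4")
  )
  # general form: pi_i pi_j (c_i + c_j) / 2
  jip <- outer(pik, pik) * outer(ci, ci, "+") / 2
  diag(jip) <- pik  # first-order probabilities on the diagonal
  jip
}

pik  <- c(0.2, 0.3, 0.5, 0.4, 0.6)  # toy probabilities, summing to n = 2
jip1 <- brewer_jip(pik, version = 1)
jip2 <- brewer_jip(pik, version = 2)
```

Under the second variant c_i is the same for every unit, so the approximation reduces to a constant multiple of \pi_i\pi_j.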
Value
A symmetric matrix of approximated joint-inclusion probabilities, whose diagonal is the vector of first-order inclusion probabilities.
References
Hartley, H.O.; Rao, J.N.K., 1962. Sampling With Unequal Probability and Without Replacement. The Annals of Mathematical Statistics 33 (2), 350-374.
Hájek, J., 1964. Asymptotic Theory of Rejective Sampling with Varying Probabilities from a Finite Population. The Annals of Mathematical Statistics 35 (4), 1491-1523.
Tillé, Y., 1996. Some Remarks on Unequal Probability Sampling Designs Without Replacement. Annals of Economics and Statistics 44, 177-189.
Brewer, K.R.W.; Donadio, M.E., 2003. The High Entropy Variance of the Horvitz-Thompson Estimator. Survey Methodology 29 (2), 189-196.
Examples
### Generate population data ---
N <- 20; n <- 5
set.seed(0)
x <- rgamma(N, scale=10, shape=5)
y <- abs( 2*x + 3.7*sqrt(x) * rnorm(N) )
pik <- n * x/sum(x)
### Approximate joint-inclusion probabilities ---
pikl <- jip_approx(pik, method='Hajek')
pikl <- jip_approx(pik, method='HartleyRao')
pikl <- jip_approx(pik, method='Tille')
pikl <- jip_approx(pik, method='Brewer1')
pikl <- jip_approx(pik, method='Brewer2')
pikl <- jip_approx(pik, method='Brewer3')
pikl <- jip_approx(pik, method='Brewer4')