| approx_var_est {UPSvarApprox} | R Documentation |
Approximated Variance Estimators
Description
Approximated variance estimators which use only first-order inclusion probabilities
Usage
approx_var_est(y, pik, method, sample = NULL, ...)
Arguments
| y | numeric vector of sample observations |
| pik | numeric vector of first-order inclusion probabilities of length N, the population size, or n, the sample size, depending on the chosen method (see Details for more information) |
| method | string indicating the desired approximate variance estimator. One of "Deville1", "Deville2", "Deville3", "Hajek", "Rosen", "FixedPoint", "Brewer1", "HartleyRao", "Berger", "Tille", "MateiTille1", "MateiTille2", "MateiTille3", "MateiTille4", "MateiTille5", "Brewer2", "Brewer3", "Brewer4". |
| sample | Either a numeric vector of length equal to the sample size with the indices of sample units, or a boolean vector of the same length of |
| ... | two optional parameters can be modified to control the iterative procedures in methods |
Details
The estimator is selected through the argument method; the available methods and their respective equations are listed below.
Matei and Tillé (2005) divide the approximated variance estimators into three classes, depending on the quantities they require:
1. First and second-order inclusion probabilities: The first class is composed of the Horvitz-Thompson estimator (Horvitz and Thompson 1952) and the Sen-Yates-Grundy estimator (Yates and Grundy 1953; Sen 1953), which are available through function varHT in package sampling;
2. Only first-order inclusion probabilities and only for sample units;
3. Only first-order inclusion probabilities, for the entire population.
Haziza, Mecatti and Rao (2008) provide a common form to express most of the estimators in classes 2 and 3:
\widehat{var}(\hat{t}_{HT}) = \sum_{i \in s}c_i e_i^2
where e_i = \frac{y_i}{\pi_i} - \hat{B} , with
\hat{B} = \frac{\sum_{i\in s} a_i (y_i/\pi_i) }{\sum_{i\in s} a_i}
and a_i and c_i are parameters that define the different estimators:
- method="Hajek" [Class 2]
  c_i = \frac{n}{n-1}(1-\pi_i) ; \quad a_i = c_i
- method="Deville2" [Class 2]
  c_i = (1-\pi_i)\Biggl\{ 1 - \sum_{j\in s}\Bigl[ \frac{1-\pi_j}{\sum_{k\in s} (1-\pi_k)} \Bigr]^2 \Biggr\}^{-1} ; \quad a_i = c_i
- method="Deville3" [Class 2]
  c_i = (1-\pi_i)\Biggl\{ 1 - \sum_{j\in s}\Bigl[ \frac{1-\pi_j}{\sum_{k\in s} (1-\pi_k)} \Bigr]^2 \Biggr\}^{-1} ; \quad a_i = 1
- method="Rosen" [Class 2]
  c_i = \frac{n}{n-1} (1-\pi_i) ; \quad a_i = (1-\pi_i)\log(1-\pi_i) / \pi_i
- method="Brewer1" [Class 2]
  c_i = \frac{n}{n-1}(1-\pi_i) ; \quad a_i = 1
- method="Brewer2" [Class 3]
  c_i = \frac{n}{n-1} \Bigl(1-\pi_i + \frac{\pi_i}{n} - n^{-2}\sum_{j \in U} \pi_j^2 \Bigr) ; \quad a_i = 1
- method="Brewer3" [Class 3]
  c_i = \frac{n}{n-1} \Bigl(1-\pi_i - \frac{\pi_i}{n} - n^{-2}\sum_{j \in U} \pi_j^2 \Bigr) ; \quad a_i = 1
- method="Brewer4" [Class 3]
  c_i = \frac{n}{n-1} \Bigl(1-\pi_i - \frac{\pi_i}{n-1} + n^{-1}(n-1)^{-1}\sum_{j \in U} \pi_j^2 \Bigr) ; \quad a_i = 1
- method="Berger" [Class 3]
  c_i = \frac{n}{n-1} (1-\pi_i) \Biggl[ \frac{\sum_{j\in s} (1-\pi_j)}{\sum_{j\in U} (1-\pi_j)} \Biggr] ; \quad a_i = c_i
- method="HartleyRao" [Class 3]
  c_i = \frac{n}{n-1} \Bigl(1-\pi_i - n^{-1}\sum_{j \in s}\pi_j + n^{-1}\sum_{j\in U} \pi_j^2 \Bigr) ; \quad a_i = 1
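As an illustration of the common form above, the following base R sketch evaluates \sum_{i \in s} c_i e_i^2 for given coefficient vectors. It is not the package's implementation: the helper name common_form_var and the toy data are assumptions made up for this example, and only the "Hajek" and "Brewer1" coefficients listed above are plugged in.

```r
# Hypothetical helper (not part of UPSvarApprox): evaluates the common form
# sum_{i in s} c_i * e_i^2 for user-supplied coefficient vectors c_i and a_i.
common_form_var <- function(ys, piks, c_i, a_i) {
  y_tilde <- ys / piks                      # expanded values y_i / pi_i
  B_hat   <- sum(a_i * y_tilde) / sum(a_i)  # a_i-weighted mean of y_i / pi_i
  e_i     <- y_tilde - B_hat                # residuals e_i
  sum(c_i * e_i^2)
}

# Toy sample values and inclusion probabilities, made up for illustration
ys   <- c(12, 30, 7, 21)
piks <- c(0.2, 0.5, 0.1, 0.4)
n    <- length(ys)

# Hajek: c_i = n/(n-1) * (1 - pi_i), a_i = c_i
c_hajek <- n / (n - 1) * (1 - piks)
v_hajek <- common_form_var(ys, piks, c_i = c_hajek, a_i = c_hajek)

# Brewer1: same c_i as Hajek, but a_i = 1
v_brewer1 <- common_form_var(ys, piks, c_i = c_hajek, a_i = rep(1, n))
```

Only c_i and a_i change between the estimators of this family, so any other entry of the list can be tried by swapping in its coefficients; class 3 entries additionally need the inclusion probabilities of the whole population.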
Some additional estimators are defined in Matei and Tillé (2005):
- method="Deville1" [Class 2]
  \widehat{var}(\hat{t}_{HT}) = \sum_{i \in s} \frac{c_i}{ \pi_i^2} (y_i - y_i^*)^2
  where
  y_i^* = \pi_i \frac{\sum_{j \in s} c_j y_j / \pi_j}{\sum_{j \in s} c_j}
  and
  c_i = (1-\pi_i)\frac{n}{n-1}
- method="Tille" [Class 3]
  \widehat{var}(\hat{t}_{HT}) = \biggl( \sum_{i \in s} \omega_i \biggr) \sum_{i\in s} \omega_i (\tilde{y}_i - \bar{\tilde{y}}_\omega )^2 - n \sum_{i\in s}\biggl( \tilde{y}_i - \frac{\hat{t}_{HT}}{n} \biggr)^2
  where
  \tilde{y}_i = y_i / \pi_i, \omega_i = \pi_i / \beta_i and \bar{\tilde{y}}_\omega = \biggl( \sum_{i \in s} \omega_i \biggr)^{-1} \sum_{i \in s} \omega_i \tilde{y}_i
  The coefficients \beta_i are computed iteratively through the following procedure:
  1. \beta_i^{(0)} = \pi_i, \,\, \forall i\in U
  2. \beta_i^{(2k-1)} = \frac{(n-1)\pi_i}{\beta^{(2k-2)} - \beta_i^{(2k-2)}}
  3. \beta_i^{(2k)} = \beta_i^{(2k-1)} \Biggl( \frac{n(n-1)}{(\beta^{(2k-1)})^2 - \sum_{j\in U} (\beta_j^{(2k-1)})^2 } \Biggr)^{1/2}
  with
  \beta^{(k)} = \sum_{i\in U} \beta_i^{(k)}, \,\, k=1,2,3, \dots
- method="MateiTille1" [Class 3]
  \widehat{var}(\hat{t}_{HT}) = \frac{n(N-1)}{N(n-1)} \sum_{i\in s} \frac{b_i}{\pi_i^3} (y_i - \hat{y}_i^*)^2
  where
  \hat{y}_i^* = \pi_i \frac{\sum_{j\in s} b_j y_j/\pi_j^2}{\sum_{j\in s} b_j/\pi_j}
  and the coefficients b_i are computed iteratively by the algorithm:
  1. b_i^{(0)} = \pi_i (1-\pi_i) \frac{N}{N-1}, \,\, \forall i \in U
  2. b_i^{(k)} = \frac{(b_i^{(k-1)})^2 }{\sum_{j\in U} b_j^{(k-1)} } + \pi_i(1-\pi_i)
  A necessary condition for convergence is checked and, if it is not satisfied, the function returns an alternative solution that uses only one iteration:
  b_i = \pi_i(1-\pi_i)\Biggl( \frac{N\pi_i(1-\pi_i)}{ (N-1)\sum_{j\in U}\pi_j(1-\pi_j) } + 1 \Biggr)
- method="MateiTille2" [Class 3]
  \widehat{var}(\hat{t}_{HT}) = \frac{1}{1 - \sum_{i\in U} \frac{d_i^2}{\pi_i} } \sum_{i\in s} (1-\pi_i) \Biggl( \frac{y_i}{\pi_i} - \frac{\hat{t}_{HT}}{n} \Biggr)^2
  where
  d_i = \frac{\pi_i(1-\pi_i)}{\sum_{j\in U} \pi_j(1-\pi_j) }
- method="MateiTille3" [Class 3]
  \widehat{var}(\hat{t}_{HT}) = \frac{1}{1 - \sum_{i\in U} \frac{d_i^2}{\pi_i} } \sum_{i\in s} (1-\pi_i) \Biggl( \frac{y_i}{\pi_i} - \frac{ \sum_{j\in s} (1-\pi_j)\frac{y_j}{\pi_j} }{ \sum_{j\in s} (1-\pi_j) } \Biggr)^2
  where d_i is defined as in method="MateiTille2".
- method="MateiTille4" [Class 3]
  \widehat{var}(\hat{t}_{HT}) = \frac{1}{1 - \sum_{i\in U} b_i/n^2 } \sum_{i\in s} \frac{b_i}{\pi_i^3} (y_i - y_i^* )^2
  where
  y_i^* = \pi_i \frac{ \sum_{j\in s} b_j y_j/\pi_j^2 }{ \sum_{j\in s} b_j/\pi_j }
  and
  b_i = \frac{ \pi_i(1-\pi_i)N }{ N-1 }
- method="MateiTille5" [Class 3]
  This estimator is defined as in method="MateiTille4", and the b_i values are defined as in method="MateiTille1".
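The iterative computation of the b_i coefficients described for method="MateiTille1" can be sketched in base R as follows. This is a minimal illustration, not the package's code: the function name iterate_b, the stopping-rule arguments max_iter and eps, and the simulated population are all assumptions, and the convergence check mentioned above is not reproduced because its exact form is not given here.

```r
# Hypothetical helper, not the package's implementation.
# max_iter and eps are assumed names for a simple stopping rule.
iterate_b <- function(pik, max_iter = 100, eps = 1e-8) {
  N <- length(pik)
  v <- pik * (1 - pik)
  b <- v * N / (N - 1)                 # starting values b_i^(0)
  for (k in seq_len(max_iter)) {
    b_new <- b^2 / sum(b) + v          # update b_i^(k) from b_i^(k-1)
    converged <- max(abs(b_new - b)) < eps
    b <- b_new
    if (converged) break
  }
  b
}

# Simulated population, made up for illustration (all pik are below 1)
set.seed(1)
N <- 20; n <- 5
x   <- runif(N, 1, 3)
pik <- n * x / sum(x)

# A single iteration reproduces the closed-form one-iteration solution
b_one    <- iterate_b(pik, max_iter = 1)
v        <- pik * (1 - pik)
b_closed <- v * (N * v / ((N - 1) * sum(v)) + 1)

# Running until the stopping rule is met
b_iter <- iterate_b(pik)
```

Comparing b_one with b_closed gives a quick check that the update step matches the one-iteration formula stated for this method.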
Value
a scalar, the estimated variance
References
Matei, A.; Tillé, Y., 2005. Evaluation of variance approximations and estimators in maximum entropy sampling with unequal probability and fixed sample size. Journal of Official Statistics 21 (4), 543-570.
Haziza, D.; Mecatti, F.; Rao, J.N.K., 2008. Evaluation of some approximate variance estimators under the Rao-Sampford unequal probability sampling design. Metron LXVI (1), 91-108.
Examples
### Generate population data ---
N <- 500; n <- 50
set.seed(0)
x <- rgamma(N, scale=10, shape=5)
y <- abs( 2*x + 3.7*sqrt(x) * rnorm(N) )
pik <- n * x/sum(x)
s <- sample(N, n)
ys <- y[s]
piks <- pik[s]
### Estimators of class 2 ---
approx_var_est(ys, piks, method="Deville1")
approx_var_est(ys, piks, method="Deville2")
approx_var_est(ys, piks, method="Deville3")
approx_var_est(ys, piks, method="Hajek")
approx_var_est(ys, piks, method="Rosen")
approx_var_est(ys, piks, method="FixedPoint")
approx_var_est(ys, piks, method="Brewer1")
### Estimators of class 3 ---
approx_var_est(ys, pik, method="HartleyRao", sample=s)
approx_var_est(ys, pik, method="Berger", sample=s)
approx_var_est(ys, pik, method="Tille", sample=s)
approx_var_est(ys, pik, method="MateiTille1", sample=s)
approx_var_est(ys, pik, method="MateiTille2", sample=s)
approx_var_est(ys, pik, method="MateiTille3", sample=s)
approx_var_est(ys, pik, method="MateiTille4", sample=s)
approx_var_est(ys, pik, method="MateiTille5", sample=s)
approx_var_est(ys, pik, method="Brewer2", sample=s)
approx_var_est(ys, pik, method="Brewer3", sample=s)
approx_var_est(ys, pik, method="Brewer4", sample=s)