PikHol {TeachingSampling} | R Documentation |
Optimal Inclusion Probabilities Under Multi-purpose Sampling
Description
Computes the population vector of optimal inclusion probabilities under Holmberg's approach.
Usage
PikHol(n, sigma, e, Pi)
Arguments
n |
Vector of optimal sample sizes for each of the characteristics of interest. |
sigma |
A matrix containing the size measures for each characteristic of interest. |
e |
Maximum allowed error under the ANOREL approach. |
Pi |
Matrix of first-order inclusion probabilities, with one column per characteristic of interest. By default, these probabilities are proportional to the corresponding column of sigma. |
Details
Assuming that all of the characteristics of interest are equally important, Holmberg's sampling design yields the following inclusion probabilities
\pi_{(opt)k}=\frac{n^*\sqrt{a_{qk}}}{\sum_{k\in U}\sqrt{a_{qk}}}
where
n^*\geq \frac{(\sum_{k\in U}\sqrt{a_{qk}})^2}{(1+c)Q+\sum_{k\in U}a_{qk}}
and
a_{qk}= \sum_{q=1}^Q \frac{\sigma^2_{qk}}{\sum_{k\in U}\left( \frac{1}{\pi_{qk}}-1\right)\sigma^2_{qk}}
Note that \sigma^2_{qk} is a size measure associated with the k-th element in the q-th characteristic of interest.
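The formulas above can be traced step by step in base R. The following is a minimal sketch, not the package's implementation: the helper name `holmberg_sketch` and the toy data are hypothetical, the columns of `sigma` are assumed to hold \sigma_{qk} (so they are squared), the constant c is passed as the argument `c`, and n^* is taken as the smallest integer satisfying the inequality. The sketch does not check whether any resulting probability exceeds 1.

```r
# Hypothetical helper: a direct transcription of the formulas in Details.
# sigma: N x Q matrix of size measures (assumed to be sigma_qk, squared below)
# Pi:    N x Q matrix of first-order inclusion probabilities
# c:     the constant c appearing in the bound for n*
holmberg_sketch <- function(sigma, Pi, c) {
  sigma2 <- as.matrix(sigma)^2
  Pi <- as.matrix(Pi)
  Q <- ncol(sigma2)
  # For each characteristic q: sum over U of (1 / pi_qk - 1) * sigma^2_qk
  denom <- colSums((1 / Pi - 1) * sigma2)
  # For each unit k: sum over q of sigma^2_qk / denom_q
  a <- as.vector(sigma2 %*% (1 / denom))
  # Smallest integer n* satisfying the inequality
  n_star <- ceiling(sum(sqrt(a))^2 / ((1 + c) * Q + sum(a)))
  # Optimal inclusion probabilities, proportional to sqrt(a_k)
  pik <- n_star * sqrt(a) / sum(sqrt(a))
  list(n_star = n_star, pik = pik)
}

# Toy population (made-up data, for illustration only)
set.seed(1)
N <- 100
x <- runif(N, 1, 10)
sigma <- cbind(sqrt(x), x)
# Starting probabilities proportional to each column of sigma,
# with assumed sample sizes of 30 and 40
Pi <- cbind(30 * sigma[, 1] / sum(sigma[, 1]),
            40 * sigma[, 2] / sum(sigma[, 2]))

out <- holmberg_sketch(sigma, Pi, c = 0.03)
out$n_star
sum(out$pik)  # equals out$n_star by construction
```

Because the probabilities are rescaled to sum to n^*, the sum of the returned vector gives the sample size, which is how the examples below recover n from the output of PikHol.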
Value
The function returns a vector of inclusion probabilities.
Author(s)
Hugo Andres Gutierrez Rojas hagutierrezro@gmail.com
References
Holmberg, A. (2002), On the Choice of Sampling Design under GREG Estimation in Multiparameter Surveys.
RD Department, Statistics Sweden.
Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.
Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros.
Editorial Universidad Santo Tomas
Examples
#######################
#### First example ####
#######################
# Uses the Lucy data to draw an optimal sample
# in a multipurpose survey context
data(Lucy)
attach(Lucy)
# Different sample sizes for two characteristics of interest: Employees and Taxes
N <- dim(Lucy)[1]
n <- c(350,400)
# Both size measures are built from the same variable (Income),
# but each characteristic of interest relates to it through a different power
sigy1 <- sqrt(Income^(1))
sigy2 <- sqrt(Income^(2))
# The matrix containing the size measures for each characteristic of interest
sigma<-cbind(sigy1,sigy2)
# The vector of optimal inclusion probabilities under Holmberg's approach
Piks<-PikHol(n,sigma,0.03)
# The optimal sample size is given by the sum of piks
n=round(sum(Piks))
# Use the S.piPS function to select the optimal sample of size n
res<-S.piPS(n,Piks)
sam <- res[,1]
# The information about the units in the sample is stored in an object called data
data <- Lucy[sam,]
attach(data)
names(data)
# Pik.s is the vector of inclusion probabilities of the units
# in the selected sample
Pik.s <- res[,2]
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
E.piPS(estima,Pik.s)
########################
#### Second example ####
########################
# We can define our own first-order inclusion probabilities
data(Lucy)
attach(Lucy)
N <- dim(Lucy)[1]
n <- c(350,400)
sigy1 <- sqrt(Income^(1))
sigy2 <- sqrt(Income^(2))
sigma<-cbind(sigy1,sigy2)
pikas <- cbind(rep(400/N, N), rep(400/N, N))
Piks<-PikHol(n,sigma,0.03, pikas)
n=round(sum(Piks))
n
res<-S.piPS(n,Piks)
sam <- res[,1]
data <- Lucy[sam,]
attach(data)
names(data)
Pik.s <- res[,2]
estima <- data.frame(Income, Employees, Taxes)
E.piPS(estima,Pik.s)