opt {stratallo} | R Documentation |
Optimum Sample Allocation in Stratified Sampling
Description
Optimum sample allocation is a classical problem in stratified sampling in
survey methodology. It consists in determining the strata sample sizes that
minimize the variance of the stratified \pi estimator of the population total
(or mean) of a given study variable, under certain constraints on the sample
sizes in strata.
The opt() user function solves the following optimum sample allocation
problem, formulated below in the language of mathematical optimization.
Minimize
f(x_1,\ldots,x_H) = \sum_{h=1}^H \frac{A^2_h}{x_h}
subject to
\sum_{h=1}^H x_h = n
m_h \leq x_h \leq M_h, \quad h = 1,\ldots,H,
where n > 0,\, A_h > 0,\, m_h > 0,\, M_h > 0, such that
m_h < M_h,\, h = 1,\ldots,H, and
\sum_{h=1}^H m_h \leq n \leq \sum_{h=1}^H M_h, are given numbers.
The minimization is on \mathbb R_+^H.
The inequality constraints are optional, and the user can choose whether and
how they are added to the optimization problem. This is controlled by the m
and M arguments of this function, according to the following rules:
- no inequality constraints imposed: both m and M must be set to NULL
  (default).
- one-sided lower bounds m_h,\, h = 1,\ldots,H, imposed: lower bounds are
  specified with m, while M is set to NULL.
- one-sided upper bounds M_h,\, h = 1,\ldots,H, imposed: upper bounds are
  specified with M, while m is set to NULL.
- box constraints imposed: lower and upper bounds must be specified with m
  and M, respectively.
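The conditions on n, A, m and M stated in the problem formulation, together with the NULL conventions for optional bounds, can be checked before calling opt(). The following is a minimal base-R sketch; check_problem() is a hypothetical helper written for illustration and is not part of stratallo.

```r
# Hypothetical helper (not part of stratallo): TRUE if the inputs satisfy
# the conditions of the allocation problem, with NULL meaning "no bounds".
check_problem <- function(n, A, m = NULL, M = NULL) {
  lower <- if (is.null(m)) 0 else sum(m)    # smallest feasible total
  upper <- if (is.null(M)) Inf else sum(M)  # largest feasible total
  n > 0 && all(A > 0) &&
    (is.null(m) || all(m > 0)) &&
    (is.null(M) || all(M > 0)) &&
    (is.null(m) || is.null(M) || all(m < M)) &&
    lower <= n && n <= upper
}

A <- c(3000, 4000, 5000, 2000)
m <- c(100, 90, 70, 50)
M <- c(300, 400, 200, 90)
check_problem(340, A, m, M)  # TRUE: sum(m) = 310 <= 340 <= 990 = sum(M)
check_problem(300, A, m, M)  # FALSE: 300 is below sum(m) = 310
```

The sketch only validates the inputs; it does not reflect how opt() itself reports invalid arguments.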
Usage
opt(n, A, m = NULL, M = NULL, M_algorithm = "rna")
Arguments
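n | total sample size, n > 0.
A | population constants A_1,\ldots,A_H, all strictly positive.
m | lower bounds m_1,\ldots,m_H, or NULL (default) if no lower bounds are imposed.
M | upper bounds M_1,\ldots,M_H, or NULL (default) if no upper bounds are imposed.
M_algorithm | name of the algorithm used when only upper bounds are imposed: "rna" (default), "sga", "sgaplus" or "coma".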
Details
The opt() function uses several allocation algorithms, depending on which
inequality constraints are imposed in the optimization problem. Each
algorithm is implemented in a separate R function that, in general, should
not be called directly by the end user. The algorithms are listed below,
together with the names of the functions that implement them. See the
documentation of a specific function for details of the corresponding
algorithm.
- one-sided lower bounds m_h,\, h = 1,\ldots,H:
  - LRNA - rna()
- one-sided upper bounds M_h,\, h = 1,\ldots,H:
  - RNA - rna()
  - SGA - sga()
  - SGAPLUS - sgaplus()
  - COMA - coma()
- box constraints m_h, M_h,\, h = 1,\ldots,H:
  - RNABOX - rnabox()
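To give a feel for the kind of procedure used in the one-sided upper-bounds case, here is a minimal base-R sketch of a recursive Neyman scheme: allocate proportionally to A_h, fix every stratum that exceeds its upper bound at that bound, and repeat on the remaining strata. The function rna_sketch() is a hypothetical illustration; it is not the package's rna() and makes no claim to match its interface or internals.

```r
# Hypothetical helper, not part of stratallo: recursive Neyman allocation
# under one-sided upper bounds M.
rna_sketch <- function(n, A, M) {
  stopifnot(n > 0, all(A > 0), all(M > 0),
            length(A) == length(M), n <= sum(M))
  x <- numeric(length(A))
  active <- rep(TRUE, length(A))  # strata not yet fixed at their upper bound
  repeat {
    # Neyman allocation of the remaining sample among the active strata.
    x[active] <- A[active] * n / sum(A[active])
    over <- active & (x > M)
    if (!any(over)) break
    # Fix violating strata at their bounds and reallocate the remainder.
    x[over] <- M[over]
    n <- n - sum(M[over])
    active <- active & !over
  }
  x
}

A <- c(3000, 4000, 5000, 2000)
M <- c(300, 400, 200, 90)
x <- rna_sketch(n = 700, A = A, M = M)
x       # approximately 175.71, 234.29, 200, 90
sum(x)  # 700
```

With these inputs the unconstrained allocation exceeds the bounds in strata 3 and 4, so both are fixed at M_h and the remaining 410 units are split between strata 1 and 2 in proportion to A_h.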
Value
Numeric vector with optimal sample allocations in strata.
Note
If no inequality constraints are added, the optimum allocation is the Neyman allocation:
x_h = A_h \frac{n}{\sum_{i=1}^H A_i}, \quad h = 1,\ldots,H.
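The Neyman formula is simple enough to evaluate directly. The following base-R lines are a small illustration using the A vector from the Examples section and an arbitrarily chosen n; they do not call any stratallo function.

```r
# Neyman allocation with no inequality constraints (base R only).
A <- c(3000, 4000, 5000, 2000)
n <- 700
x_neyman <- A * n / sum(A)
x_neyman       # 150 200 250 100
sum(x_neyman)  # 700
```

Each stratum receives a share of n proportional to A_h, so the allocations always sum to n, but nothing prevents a single x_h from exceeding a bound that one may wish to impose.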
For the stratified \pi estimator of the population total under stratified
simple random sampling without replacement, the parameters of the objective
function f are:
A_h = N_h S_h, \quad h = 1,\ldots,H,
where N_h is the size of stratum h and S_h denotes the standard deviation of
a given study variable in stratum h.
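As a small illustration of how A_h = N_h S_h connects the objective function f to the variance of the estimator, the base-R sketch below uses made-up stratum sizes N and standard deviations S (illustrative assumptions, not data from the package). It relies on the standard variance formula for the estimator of the total under stratified simple random sampling without replacement, which equals f(x) minus a constant that does not depend on the allocation.

```r
# Hypothetical strata: sizes N and standard deviations S (illustrative values).
N <- c(1000, 2000, 500)
S <- c(3, 2, 4)
A <- N * S                 # 3000 4000 2000
x <- A * 90 / sum(A)       # Neyman allocation for n = 90: 30 40 20

# Variance of the estimator of the total:
# sum_h N_h^2 * (1/x_h - 1/N_h) * S_h^2
#   = sum_h A_h^2 / x_h - sum_h N_h * S_h^2
v_direct <- sum(N^2 * (1 / x - 1 / N) * S^2)
v_via_f  <- sum(A^2 / x) - sum(N * S^2)
all.equal(v_direct, v_via_f)  # TRUE
v_via_f                       # 875000
```

Because the subtracted term is constant in x, minimizing f is equivalent to minimizing the variance.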
References
Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling, Springer, New York.
See Also
optcost(), rna(), sga(), sgaplus(), coma(), rnabox().
Examples
A <- c(3000, 4000, 5000, 2000)
m <- c(100, 90, 70, 50)
M <- c(300, 400, 200, 90)
# One-sided lower bounds.
opt(n = 340, A = A, m = m)
opt(n = 400, A = A, m = m)
opt(n = 700, A = A, m = m)
# One-sided upper bounds.
opt(n = 190, A = A, M = M)
opt(n = 700, A = A, M = M)
# Box-constraints.
opt(n = 340, A = A, m = m, M = M)
opt(n = 500, A = A, m = m, M = M)
xopt <- opt(n = 800, A = A, m = m, M = M)
xopt
var_st(x = xopt, A = A, A0 = 45000) # Value of the variance for allocation xopt.
# Execution-time comparison of the algorithms with the microbenchmark R package.
## Not run:
N <- pop969[, "N"]
S <- pop969[, "S"]
A <- N * S
nfrac <- c(0.005, seq(0.05, 0.95, 0.05))
n <- setNames(as.integer(nfrac * sum(N)), nfrac)
lapply(
  n,
  function(ni) {
    microbenchmark::microbenchmark(
      RNA = opt(ni, A, M = N, M_algorithm = "rna"),
      SGA = opt(ni, A, M = N, M_algorithm = "sga"),
      SGAPLUS = opt(ni, A, M = N, M_algorithm = "sgaplus"),
      COMA = opt(ni, A, M = N, M_algorithm = "coma"),
      times = 200,
      unit = "us"
    )
  }
)
## End(Not run)