lcube {BalancedSampling} | R Documentation |
The Local Cube method
Description
Selects doubly balanced samples with prescribed inclusion probabilities from a finite population using the Local Cube method.
Usage
lcube(prob, Xspread, Xbal, type = "kdtree2", bucketSize = 50, eps = 1e-12)
lcubestratified(
prob,
Xspread,
Xbal,
integerStrata,
type = "kdtree2",
bucketSize = 50,
eps = 1e-12
)
Arguments
prob |
A vector of length N with inclusion probabilities. |
Xspread |
An N by p matrix of (standardized) auxiliary variables. Squared euclidean distance is used in the |
Xbal |
An N by q matrix of balancing auxiliary variables. |
type |
The method used in finding nearest neighbours.
Must be one of "kdtree0", "kdtree1", "kdtree2", and "notree". |
bucketSize |
The maximum size of the terminal nodes in the k-d-trees. |
eps |
A small value used to determine when an updated probability is close enough to 0.0 or 1.0. |
integerStrata |
An integer vector of length N with stratum numbers. |
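The argument shapes listed above can be assembled as in the following sketch, which uses base R only. The simulated data and the names `coords` and `x` are illustrative assumptions, not part of the package, and `scale()` is just one common way to standardize the spreading variables.

```r
# Illustrative inputs only: simulated data, hypothetical variable names.
set.seed(1)
N <- 200L
n <- 20L

# Inclusion probabilities: length N, here equal and summing to n.
prob <- rep(n / N, N)

# Spreading variables: N by p, standardized column-wise with scale().
coords <- matrix(runif(N * 2), ncol = 2)
Xspread <- scale(coords)

# Balancing variables: N by q, with prob as the first column.
x <- matrix(rnorm(N * 3), ncol = 3)
Xbal <- cbind(prob, x)

# Stratum numbers: integer vector of length N (used by lcubestratified()).
integerStrata <- rep(1:4, each = N / 4)

# Quick look at the shapes and types that were built.
str(list(prob = length(prob), Xspread = dim(Xspread), Xbal = dim(Xbal),
         integerStrata = class(integerStrata)))
```

Putting prob first in Xbal matches the condition for a fixed sample size described under Details.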
Details
If prob sums to an integer n, and prob is included as the first balancing variable, a fixed-size sample (of size n) will be produced.
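The fixed-size property can be checked with a short sketch like the one below. The data are simulated, and the call is guarded so that the block is skipped when BalancedSampling is not installed; the argument order follows the Usage section.

```r
# Sketch: check the fixed-size property on simulated data.
set.seed(1)
N <- 500L
n <- 50L
prob <- rep(n / N, N)                    # sums to the integer n
x <- matrix(runif(N * 2), ncol = 2)      # simulated balancing variables
xspr <- matrix(runif(N * 2), ncol = 2)   # simulated spreading variables

# Skipped if the BalancedSampling package is not available.
if (requireNamespace("BalancedSampling", quietly = TRUE)) {
  # prob is the first column of the balancing matrix.
  s <- BalancedSampling::lcube(prob, xspr, cbind(prob, x))
  print(length(s))  # should equal n under the condition above
}
```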
Stratified lcube
For lcubestratified, prob is automatically inserted as a balancing variable.
The stratified version uses the fast flight Cube method and pooling of landing phases.
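To see how a stratified sample is distributed, the selected indices can be tabulated by stratum and compared with the per-stratum sums of prob. This is a sketch on simulated data, guarded so that it is skipped when BalancedSampling is not installed; the stratum sizes are arbitrary assumptions chosen so that each stratum's probabilities sum to an integer.

```r
# Sketch: tabulate a stratified sample by stratum (simulated data).
set.seed(1)
N <- 400L
prob <- rep(0.1, N)
x <- matrix(runif(N * 2), ncol = 2)      # balancing variables (prob is added automatically)
xspr <- matrix(runif(N * 2), ncol = 2)   # spreading variables
strata <- rep(1:4, times = c(40, 80, 120, 160))  # integer stratum numbers

# Skipped if the BalancedSampling package is not available.
if (requireNamespace("BalancedSampling", quietly = TRUE)) {
  s <- BalancedSampling::lcubestratified(prob, xspr, x, strata)
  print(table(stratum = strata[s]))      # realized sample size per stratum
  print(tapply(prob, strata, sum))       # expected sample size per stratum
}
```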
Value
A vector of selected indices in 1,2,...,N.
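Because the return value is a vector of row indices, it can be used directly to subset the population data. The sketch below assumes the usual Horvitz-Thompson estimator; the helper `ht_totals()` is hypothetical and not part of BalancedSampling, and the index vector `s` is hand-picked to stand in for the output of lcube().

```r
# Hypothetical helper (not part of BalancedSampling): Horvitz-Thompson
# estimates of the column totals of X from the sampled indices s.
ht_totals <- function(s, prob, X) {
  # Each sampled row is weighted by 1 / its inclusion probability.
  colSums(X[s, , drop = FALSE] / prob[s])
}

# Toy population; s is a hand-picked stand-in for a selected sample.
prob <- rep(0.5, 4)
X <- cbind(prob = prob, x = c(1, 2, 3, 4))
s <- c(1L, 4L)

ht_totals(s, prob, X)  # estimated totals from the sample
colSums(X)             # population totals to compare against
```

In this toy case the two results coincide (2 for prob and 10 for x), which is what balancing on these variables aims for.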
Functions
- lcubestratified()
k-d-trees
The "kdtree" types create k-d-trees with terminal node bucket sizes according to bucketSize.
"kdtree0" creates a k-d-tree using a median split on alternating variables.
"kdtree1" creates a k-d-tree using a median split on the largest range.
"kdtree2" creates a k-d-tree using a sliding-midpoint split.
"notree" does a naive search for the nearest neighbour.
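The search types listed above can be swapped without changing the rest of the call. The following sketch runs lcube() once per type on the same simulated data and records the sample size; the value bucketSize = 10 is an arbitrary assumption for illustration, and the block is skipped when BalancedSampling is not installed.

```r
# Sketch: run lcube() with each nearest-neighbour search type.
set.seed(1)
N <- 300L
n <- 30L
prob <- rep(n / N, N)
x <- matrix(runif(N * 2), ncol = 2)      # simulated balancing variables
xspr <- matrix(runif(N * 2), ncol = 2)   # simulated spreading variables
types <- c("kdtree0", "kdtree1", "kdtree2", "notree")

# Skipped if the BalancedSampling package is not available.
if (requireNamespace("BalancedSampling", quietly = TRUE)) {
  sizes <- vapply(types, function(tp) {
    set.seed(42)  # same seed for every type
    s <- BalancedSampling::lcube(prob, xspr, cbind(prob, x),
                                 type = tp, bucketSize = 10)
    length(s)
  }, integer(1))
  print(sizes)    # sample size obtained with each type
}
```

Wrapping each call in system.time() is one way to compare the search types on larger populations.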
References
Deville, J. C. and Tillé, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91(4), 893-912.
Chauvet, G. and Tillé, Y. (2006). A fast algorithm for balanced sampling. Computational Statistics, 21(1), 53-62.
Chauvet, G. (2009). Stratified balanced sampling. Survey Methodology, 35, 115-119.
Grafström, A. and Tillé, Y. (2013). Doubly balanced spatial sampling with spreading and restitution of auxiliary totals. Environmetrics, 24(2), 120-131.
See Also
Other sampling: cube(), hlpm2(), lpm(), scps()
Examples
## Not run:
# Example 1: equal inclusion probabilities; prob is the first balancing variable
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
xspr = matrix(runif(N * 2), ncol = 2);
s = lcube(prob, xspr, cbind(prob, x));
plot(x[, 1], x[, 2]);
points(x[s, 1], x[s, 2], pch = 19);
# Example 2: stratified version; prob is inserted as a balancing variable automatically
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
xspr = matrix(runif(N * 2), ncol = 2);
strata = c(rep(1L, 100), rep(2L, 200), rep(3L, 300), rep(4L, 400));
s = lcubestratified(prob, xspr, x, strata);
plot(x[, 1], x[, 2]);
points(x[s, 1], x[s, 2], pch = 19);
# Example 3: unequal probabilities; compare empirical inclusion frequencies with prob
set.seed(12345);
prob = c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9);
N = length(prob);
x = matrix(runif(N * 2), ncol = 2);
xspr = matrix(runif(N * 2), ncol = 2);
ep = rep(0L, N);
r = 10000L;
for (i in seq_len(r)) {
s = lcube(prob, xspr, cbind(prob, x));
ep[s] = ep[s] + 1L;
}
print(ep / r);
## End(Not run)