lpm {BalancedSampling} | R Documentation |
The (Local) Pivotal Methods
Description
Selects spatially balanced samples with prescribed inclusion probabilities from a finite population using the Local Pivotal Method 1 (LPM1).
Usage
lpm(prob, x, type = "kdtree2", bucketSize = 50, eps = 1e-12)
lpm1(prob, x, type = "kdtree2", bucketSize = 50, eps = 1e-12)
lpm2(prob, x, type = "kdtree2", bucketSize = 50, eps = 1e-12)
lpm1s(prob, x, type = "kdtree2", bucketSize = 50, eps = 1e-12)
spm(prob, eps = 1e-12)
rpm(prob, eps = 1e-12)
Arguments
prob |
A vector of length N with inclusion probabilities, or an integer > 1. If an integer n, then the sample will be drawn with equal probabilities n / N. |
x |
An N by p matrix of (standardized) auxiliary variables. Squared euclidean distance is used in the |
type |
The method used in finding nearest neighbours.
Must be one of |
bucketSize |
The maximum size of the terminal nodes in the k-d-trees. |
eps |
A small value used to determine when an updated probability is close enough to 0.0 or 1.0. |
Details
If prob
sum to an integer n, a fixed sized sample (n) will be produced.
For spm
and rpm
, prob
must be a vector of inclusion probabilities.
If equal inclusion probabilities is wanted, this can be produced by
rep(n / N, N)
.
The available pivotal methods are:
-
lpm1
: The Local Pivotal Mehtod 1 (Grafström et al., 2012). Updates only units which are mutual nearest neighbours. Selects such a pair at random. -
lpm2
,lpm
: The Local Pivotal Method 2 (Grafström et al., 2012). Selects a unit at random, which competes with this units nearest neighbour. -
lpm1s
: The Local Pivotal Method 1 search: (Prentius, 2023). Updates only units which are mutual nearest neighbours. Selects such a pair by branching the remaining units, giving higher probabilities to update a pair with a long branch. This changes the algorithm of lpm1, but makes it faster. -
spm
: The Sequential Pivotal Method. Selects the two units with smallest indices to compete against each other. If the list is ordered, the algorithm is similar to systematic sampling. -
rpm
: The Random Pivotal Method. Selects two units at random to compete against each other. Produces a design with high entropy.
Value
A vector of selected indices in 1,2,...,N.
Functions
-
lpm1()
: -
lpm2()
: -
lpm1s()
: -
spm()
: -
rpm()
:
k-d-trees
The type
s "kdtree" creates k-d-trees with terminal node bucket sizes
according to bucketSize
.
"kdtree0" creates a k-d-tree using a median split on alternating variables.
"kdtree1" creates a k-d-tree using a median split on the largest range.
"kdtree2" creates a k-d-tree using a sliding-midpoint split.
"notree" does a naive search for the nearest neighbour.
References
Friedman, J. H., Bentley, J. L., & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software (TOMS), 3(3), 209-226.
Deville, J.-C., & Tillé, Y. (1998). Unequal probability sampling without replacement through a splitting method. Biometrika 85, 89-101.
Maneewongvatana, S., & Mount, D. M. (1999, December). It’s okay to be skinny, if your friends are fat. In Center for geometric computing 4th annual workshop on computational geometry (Vol. 2, pp. 1-8).
Chauvet, G. (2012). On a characterization of ordered pivotal sampling. Bernoulli, 18(4), 1320-1340.
Grafström, A., Lundström, N.L.P. & Schelin, L. (2012). Spatially balanced sampling through the Pivotal method. Biometrics 68(2), 514-520.
Lisic, J. J., & Cruze, N. B. (2016, June). Local pivotal methods for large surveys. In Proceedings of the Fifth International Conference on Establishment Surveys.
Prentius, W. (2023) Manuscript.
See Also
Other sampling:
cube()
,
hlpm2()
,
lcube()
,
scps()
Examples
## Not run:
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
s = lpm2(prob, x);
plot(x[, 1], x[, 2]);
points(x[s, 1], x[s, 2], pch = 19);
set.seed(12345);
prob = c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9);
N = length(prob);
x = matrix(runif(N * 2), ncol = 2);
ep = rep(0L, N);
r = 10000L;
for (i in seq_len(r)) {
s = lpm2(prob, x);
ep[s] = ep[s] + 1L;
}
print(ep / r);
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
lpm1(prob, x);
lpm2(prob, x);
lpm1s(prob, x);
spm(prob);
rpm(prob);
## End(Not run)