scps {BalancedSampling} | R Documentation |
Spatially Correlated Poisson Sampling
Description
Selects spatially balanced samples with prescribed inclusion probabilities from a finite population using Spatially Correlated Poisson Sampling (SCPS).
Usage
scps(prob, x, rand = NULL, type = "kdtree2", bucketSize = 50, eps = 1e-12)
lcps(prob, x, type = "kdtree2", bucketSize = 50, eps = 1e-12)
Arguments
prob |
A vector of length N with inclusion probabilities, or an integer > 1. If an integer n, then the sample will be drawn with equal probabilities n / N. |
x |
An N by p matrix of (standardized) auxiliary variables. Squared Euclidean distance is used in the x space. |
rand |
A vector of length N with random numbers. If supplied, the decision for each unit is made using the corresponding random number, which makes it possible to coordinate samples. |
type |
The method used in finding nearest neighbours. Must be one of "kdtree0", "kdtree1", "kdtree2", and "notree". |
bucketSize |
The maximum size of the terminal nodes in the k-d-trees. |
eps |
A small value used to determine when an updated probability is close enough to 0.0 or 1.0. |
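The argument conventions above can be illustrated in base R. This is a rough sketch under stated assumptions, not the package's internal code: expand_prob and sq_dist are hypothetical helper names introduced here, and scale() is just one common way to standardize the columns of x.

```r
# Hypothetical helper: turn an integer sample size n into N equal
# inclusion probabilities n / N; a probability vector is returned unchanged.
expand_prob <- function(prob, N) {
  if (length(prob) == 1L && prob > 1) rep(prob / N, N) else prob
}

# Hypothetical helper: squared Euclidean distance between rows i and j of x.
sq_dist <- function(x, i, j) sum((x[i, ] - x[j, ])^2)

set.seed(1)
N <- 20
x_raw <- cbind(runif(N, 0, 1000), runif(N, 0, 1))  # very different scales
x_std <- scale(x_raw)                               # column-wise mean 0, sd 1

prob <- expand_prob(5, N)    # same as rep(5 / N, N)
d12 <- sq_dist(x_std, 1, 2)  # distance between units 1 and 2 after scaling
```

Without standardization, the first column of x_raw would dominate every distance simply because of its larger scale.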
Details
If the elements of prob sum to an integer n, a fixed-size sample of n units will be produced.
The implementation uses the maximal weight strategy, as specified in Grafström (2012).
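Whether a given probability vector leads to a fixed sample size can be checked before sampling. The following is a minimal sketch; sums_to_integer is a hypothetical helper, and its tolerance tol is an assumption of this sketch rather than the eps argument of the package.

```r
# Hypothetical helper: TRUE if the probabilities sum to an integer
# (up to a tolerance), in which case the sample size is fixed.
# `tol` is an assumption of this sketch, not the package's `eps`.
sums_to_integer <- function(prob, tol = 1e-9) {
  total <- sum(prob)
  abs(total - round(total)) < tol
}

prob_fixed <- c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9)
prob_random <- c(0.3, 0.3, 0.3)

sums_to_integer(prob_fixed)   # the sum is 5, so every sample has 5 units
sums_to_integer(prob_random)  # the sum is 0.9, so the sample size varies
```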
Coordinated SCPS
If rand is supplied, coordinated SCPS will be performed.
The algorithm for coordinated SCPS differs from the SCPS algorithm, as
uncoordinated SCPS chooses a unit to update randomly, whereas coordinated SCPS
traverses the units in the supplied order.
This has a small impact on the efficiency of the algorithm for coordinated SCPS.
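To make the traversal and the maximal weight strategy concrete, the following base R sketch walks through the units in the supplied order. It is an illustration only and not the package's implementation: scps_coord_sketch is a hypothetical name, the decision rule rand[i] < p[i] is an assumption, ties in distance are broken by index order, and the nearest neighbours are found by a naive O(N^2) search.

```r
# Sketch of coordinated SCPS with maximal weights (base R, O(N^2)).
# Assumptions: ties in distance are broken by index order, and only units
# later in the supplied order receive weight.
scps_coord_sketch <- function(prob, x, rand, eps = 1e-12) {
  x <- as.matrix(x)
  N <- length(prob)
  p <- prob
  for (i in seq_len(N)) {
    p_i <- p[i]
    if (p_i <= eps || p_i >= 1 - eps) {
      # Already settled by earlier updates; nothing to distribute.
      p[i] <- round(p_i)
      next
    }
    # Decide unit i with its own random number.
    decision <- as.numeric(rand[i] < p_i)
    p[i] <- decision
    if (i == N) break
    # Undecided units, nearest first (squared Euclidean distance).
    rest <- (i + 1):N
    d <- colSums((t(x[rest, , drop = FALSE]) - x[i, ])^2)
    remaining <- 1  # total weight still to hand out
    for (j in rest[order(d)]) {
      # Largest weight that keeps the updated p[j] inside [0, 1].
      w <- min(remaining, p[j] / (1 - p_i), (1 - p[j]) / p_i)
      p[j] <- min(1, max(0, p[j] - (decision - p_i) * w))
      remaining <- remaining - w
      if (remaining <= 0) break
    }
  }
  which(p == 1)
}

set.seed(12345)
N <- 50
prob <- rep(0.2, N)
x <- matrix(runif(N * 2), ncol = 2)
rand <- runif(N)
s1 <- scps_coord_sketch(prob, x, rand)
s2 <- scps_coord_sketch(prob, x, rand)  # same rand, hence the same sample
```

Because all randomness enters through rand, reusing the same vector reproduces the sample, and reusing it with slightly different probabilities is what allows samples to be coordinated.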
Locally Correlated Poisson Sampling (LCPS)
The method differs from SCPS as LPM1 differs from LPM2. In each step of the algorithm, the unit with the smallest updating distance is chosen as the deciding unit.
Value
A vector of selected indices in 1,2,...,N.
Functions
- lcps(): Locally Correlated Poisson Sampling
k-d-trees
The "kdtree" types create k-d-trees with terminal node bucket sizes set by bucketSize.
"kdtree0" creates a k-d-tree using a median split on alternating variables.
"kdtree1" creates a k-d-tree using a median split on the largest range.
"kdtree2" creates a k-d-tree using a sliding-midpoint split.
"notree" does a naive search for the nearest neighbour.
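Two of the ideas in this list can be sketched in a few lines of base R. The helper names nearest_naive and median_split_largest_range are hypothetical, and the code only illustrates the search and split rules described above; it does not reproduce the package's tree code.

```r
# Naive search, in the spirit of "notree": scan every candidate unit and
# return the one closest to unit i (squared Euclidean distance).
nearest_naive <- function(x, i, candidates) {
  candidates <- setdiff(candidates, i)
  d <- colSums((t(x[candidates, , drop = FALSE]) - x[i, ])^2)
  candidates[which.min(d)]
}

# One split choice, in the spirit of "kdtree1": split on the variable with
# the largest range, at the median of that variable.
median_split_largest_range <- function(x) {
  ranges <- apply(x, 2, function(col) diff(range(col)))
  dim_split <- which.max(ranges)
  list(dim = dim_split, value = median(x[, dim_split]))
}

x <- rbind(c(0, 0), c(1, 0), c(0, 3), c(2, 8))
nn <- nearest_naive(x, 1, seq_len(nrow(x)))  # unit nearest to unit 1
sp <- median_split_largest_range(x)          # split variable and value
```

The naive search costs one pass over all candidates per query, which is why the tree-based types are usually preferable for large N.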
References
Friedman, J. H., Bentley, J. L., & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software (TOMS), 3(3), 209-226.
Maneewongvatana, S., & Mount, D. M. (1999, December). It’s okay to be skinny, if your friends are fat. In Center for geometric computing 4th annual workshop on computational geometry (Vol. 2, pp. 1-8).
Grafström, A. (2012). Spatially correlated Poisson sampling. Journal of Statistical Planning and Inference, 142(1), 139-147.
Prentius, W. (2023). Locally correlated Poisson sampling. Environmetrics, e2832.
See Also
Other sampling: cube(), hlpm2(), lcube(), lpm()
Examples
## Not run:
# Equal inclusion probabilities; plot the population and the selected units
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
s = scps(prob, x);
plot(x[, 1], x[, 2]);
points(x[s, 1], x[s, 2], pch = 19);
# Unequal inclusion probabilities; estimate them empirically by repeated sampling
set.seed(12345);
prob = c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9);
N = length(prob);
x = matrix(runif(N * 2), ncol = 2);
ep = rep(0L, N);
r = 10000L;
for (i in seq_len(r)) {
  s = scps(prob, x);
  ep[s] = ep[s] + 1L;
}
print(ep / r);
# Draw one SCPS sample and one LCPS sample from the same population
set.seed(12345);
N = 1000;
n = 100;
prob = rep(n/N, N);
x = matrix(runif(N * 2), ncol = 2);
scps(prob, x);
lcps(prob, x);
## End(Not run)