cube {BalancedSampling} R Documentation

## Cube method (Balanced sampling)

### Description

This is a fast implementation of the cube method. To have a fixed sample size, include the inclusion probabilities as a balancing variable in Xbal and make sure the inclusion probabilities sum to a positive integer. Landing is done by dropping balancing variables (from rightmost column, so keep inclusion probabilities in first column to guarantee fixed size).

### Usage

cube(prob,Xbal)


### Arguments

 prob vector of length N with inclusion probabilities Xbal matrix of balancing auxiliary variables of N rows and r columns

### Value

Returns a vector of selected indexes in 1,2,...,N.

### References

Deville, J. C. and TillĂ©, Y. (2004). Efficient balanced sampling: the cube method. Biometrika, 91(4), 893-912.

Chauvet, G. and TillĂ©, Y. (2006). A fast algorithm for balanced sampling. Computational Statistics, 21(1), 53-62.

### Examples

## Not run:
# Example 1
# Select sample
set.seed(12345);
N = 1000; # population size
n = 100; # sample size
p = rep(n/N,N); # inclusion probabilities
X = cbind(p,runif(N),runif(N)); # matrix of auxiliary variables
s = cube(p,X); # select sample

# Example 2
# Check inclusion probabilities
set.seed(12345);
p = c(0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.55, 0.65, 0.7, 0.9); # prescribed inclusion probabilities
N = length(p); # population size
ep = rep(0,N); # empirical inclusion probabilities
nrs = 10000; # repetitions
for(i in 1:nrs){
s = cube(p,cbind(p));
ep[s]=ep[s]+1;
}
print(ep/nrs);

# Example 3
# How fast is it?
# Let's check with N = 100 000 and 5 balancing variables
set.seed(12345);
N = 100000; # population size
n = 100; # sample size
p = rep(n/N,N); # inclusion probabilities
# matrix of 5 auxiliary variables
X = cbind(p,runif(N),runif(N),runif(N),runif(N));
system.time(cube(p,X));

## End(Not run)


[Package BalancedSampling version 1.6.3 Index]