strata.bh {stratification}    R Documentation

Stratification of a Population Given a Set of Boundaries

Description

The function strata.bh stratifies a population given a set of boundaries. It calculates the stratum sample sizes and the anticipated coefficient of variation or relative root mean squared error.

Usage

strata.bh(x, bh, n = NULL, CV = NULL, Ls = 3, certain = NULL,
          alloc = list(q1 = 0.5, q2 = 0, q3 = 0.5), takenone = 0, 
          bias.penalty = 1, takeall = 0, takeall.adjust = TRUE, 
          rh = rep(1, Ls), model = c("none", "loglinear", "linear",
          "random"), model.control = list())      

Arguments

x

A vector containing the values of the stratification variable X for every unit in the population.

bh

A vector of the L-1 stratum boundaries (b_1, b_2, ..., b_{L-1}), where L is the total number of strata (excluding the certainty stratum, if any). Therefore, if takenone=0 then L=Ls, and if takenone=1 then L=Ls+1.

n

A numeric: the target sample size. It has no default value. Either n or CV must be supplied.

CV

A numeric: the target coefficient of variation (or the target relative root mean squared error if takenone=1). It has no default value. Either CV or n must be supplied.

Ls

A numeric: the number of sampled strata (take-none and certain strata are not counted in Ls). The default is 3.

certain

A vector giving the position, in the vector x, of the units that must be included in the sample (see stratification-package). By default certain is NULL, which means that no units are a priori chosen to be in the sample.

alloc

A list specifying the allocation scheme. The list must contain 3 numerics for the 3 exponents q1, q2 and q3 in the general allocation scheme (see stratification-package). The default is Neyman allocation (q1=q3=0.5 and q2=0).

takenone

A numeric: the number of take-none strata (0 or 1). The default is 0, i.e. no take-none stratum is included.

bias.penalty

A numeric between 0 and 1 giving the penalty for the bias in the anticipated mean squared error (MSE) of the survey estimator (see stratification-package). This argument is relevant only if takenone=1. The default is 1.

takeall

A numeric: the number of take-all strata (one of {0, 1, ..., Ls-1}). The default is 0, i.e. no take-all stratum is included.

takeall.adjust

A logical. If TRUE (the default), when n_h > N_h for a take-some stratum, the takeall argument is increased by one and the allocation is carried out again. This is repeated until n_h <= N_h for every take-some stratum. If FALSE, no adjustment is made. Note: in the other functions of the package stratification, this adjustment is not optional; it is always made automatically (see stratification-package).

rh

A vector giving the anticipated response rates in each of the Ls sampled strata. A single number can be given if the rates do not vary among strata. The default is 1 in each stratum.

model

A character string identifying the model used to describe the discrepancy between the stratification variable X and the survey variable Y. It can be "none" if one assumes Y=X, "loglinear" for the loglinear model with mortality, "linear" for the heteroscedastic linear model or "random" for the random replacement model (see stratification-package for a description of these models). The default is "none".

model.control

A list of model parameters (see stratification-package). The default values of the parameters correspond to the model Y=X.
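
The roles of bh and alloc can be illustrated with a minimal base R sketch that does not call the package. It rests on several assumptions that are not stated on this page: that stratum h holds the units with b_{h-1} <= x < b_h, that the model is Y=X, that stratum variances are computed with var(), and that the general allocation scheme gives each stratum a share proportional to N_h^(2*q1) * mean_h^(2*q2) * S_h^(2*q3). The last form is consistent with the default being Neyman allocation, but stratification-package remains the reference. The data and the helper alloc_shares are hypothetical.

```r
# Toy population and boundaries (illustrative values, not from the package)
x  <- c(2, 5, 9, 14, 20, 37, 58, 120)
bh <- c(10, 50)                       # L - 1 = 2 boundaries, so L = 3 strata

# Assumption: stratum h holds the units with b_{h-1} <= x < b_h
stratum <- findInterval(x, bh) + 1

Nh    <- tabulate(stratum, nbins = length(bh) + 1)  # stratum population sizes
meanh <- as.numeric(tapply(x, stratum, mean))       # stratum means (Y = X)
varh  <- as.numeric(tapply(x, stratum, var))        # stratum variances (var() assumed)

# Hypothetical helper. Assumed general allocation: shares proportional to
# Nh^(2*q1) * meanh^(2*q2) * Sh^(2*q3), where Sh = sqrt(varh)
alloc_shares <- function(Nh, meanh, varh, q1 = 0.5, q2 = 0, q3 = 0.5) {
  g <- Nh^(2 * q1) * meanh^(2 * q2) * varh^q3       # Sh^(2*q3) equals varh^q3
  g / sum(g)
}

# Default exponents reduce to shares proportional to Nh * Sh (Neyman)
shares <- alloc_shares(Nh, meanh, varh)
```

Under these assumptions, multiplying shares by a target n gives non-integer stratum sample sizes comparable in spirit to nhnonint; the package's own rounding and take-all handling are not reproduced here.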

Value

Nh

A vector of length L containing the population sizes N_h, i.e. the number of units in each stratum.

nh

A vector of length L containing the sample sizes n_h, i.e. the number of units to sample in each stratum. See stratification-package for information about the rounding used to get these integer values.

n

The total sample size (sum(nh)).

nhnonint

A vector of length L containing the non-integer values of the sample sizes, obtained directly from applying the allocation rule (see stratification-package).

certain.info

A vector giving statistics for the certainty stratum (see stratification-package). It contains Nc, the number of units chosen a priori to be in the sample, and meanc, the anticipated mean of Y for these units.

opti.nh

The final value of the criterion to optimize (either the total sample size n if a target CV was given, or the RRMSE if a target n was given), calculated with the integer stratum sample sizes nh.

opti.nhnonint

The final value of the criterion to optimize (either the total sample size n if a target CV was given, or the RRMSE if a target n was given), calculated with the non-integer stratum sample sizes nhnonint.

meanh

A vector of length L containing the anticipated means of Y in each stratum.

varh

A vector of length L containing the anticipated variances of Y in each stratum.

mean

A numeric: the anticipated global mean value of Y.

RMSE

A numeric: the root mean squared error (or standard error if takenone=0) of the anticipated global mean of Y. This is defined as the square root of: (bias.penalty x bias of the mean)^2 + variance of the mean.

RRMSE

A numeric: the anticipated relative root mean squared error (or coefficient of variation if takenone=0) for the mean of Y, i.e. RMSE divided by mean.

relativebias

A numeric: the anticipated relative bias of the estimator, i.e. (bias.penalty x bias of the mean) divided by mean. If takenone=0, this numeric is zero.

propbiasMSE

A numeric: the proportion of the MSE attributable to the bias of the estimator, i.e. (bias.penalty x bias of the mean)^2 divided by the MSE of the mean. If takenone=0, this numeric is zero.

stratumID

A factor, of the same length as the input x, whose values are either 1, 2, ..., L or "certain". The value "certain" is given to units a priori chosen to be in the sample. This factor identifies, for each observation, the stratum to which it has been assigned.

takeall

The number of take-all strata in the final solution. Note: It is possible that n_h=N_h for non take-all strata because the condition for an automatic addition of a take-all stratum is n_h>N_h.

call

The function call (object of class "call").

date

A character string that contains the system date and time when the function ended.

args

A list of all the argument values input to the function or set by default.
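
The definitions of RMSE, RRMSE, relativebias and propbiasMSE given above fit together as in the following base R sketch. The helper rmse_summary and its inputs (ybar, bias, variance) are hypothetical names chosen for illustration; the package computes the bias and variance internally.

```r
# Hypothetical helper: combines an anticipated mean (ybar), the bias of the
# mean and the variance of the mean as described in the Value section
rmse_summary <- function(ybar, bias, variance, bias.penalty = 1) {
  pen_bias <- bias.penalty * bias        # penalized bias of the mean
  mse      <- pen_bias^2 + variance      # anticipated MSE of the mean
  rmse     <- sqrt(mse)
  list(RMSE         = rmse,
       RRMSE        = rmse / ybar,       # RMSE divided by the mean
       relativebias = pen_bias / ybar,   # penalized bias divided by the mean
       propbiasMSE  = pen_bias^2 / mse)  # share of the MSE due to bias
}

# Illustrative numbers only
out <- rmse_summary(ybar = 100, bias = 3, variance = 16)
out$RMSE         # 5
out$RRMSE        # 0.05
out$relativebias # 0.03
out$propbiasMSE  # 0.36
```

With bias = 0, as when takenone=0, RMSE reduces to the standard error, RRMSE to the coefficient of variation, and the last two quantities are zero.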

Author(s)

Sophie Baillargeon Sophie.Baillargeon@mat.ulaval.ca and
Louis-Paul Rivest Louis-Paul.Rivest@mat.ulaval.ca

References

Baillargeon, S. and Rivest, L.-P. (2011). The construction of stratified designs in R with the package stratification. Survey Methodology, 37(1), 53-65.

See Also

print.strata, plot.strata, strata.cumrootf, strata.geo, strata.LH

Examples

adjust <- strata.geo(x=USbanks, CV=0.01, Ls=4, alloc=c(0.35,0.35,0))
adjust
adjust$nhnonint
noadjust <- strata.bh(x=USbanks, bh=adjust$bh, CV=0.01, Ls=4,
            alloc=c(0.35,0.35,0), takeall=0, takeall.adjust=FALSE)
noadjust
noadjust$nhnonint
# without the adjustment for a take-all stratum, n is smaller than
# with the adjustment, but the target CV is not reached.
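
The take-all adjustment contrasted in this example can be sketched in base R without the package. This is a simplified illustration, not the package's implementation: it assumes a fixed target n (the example above uses a target CV instead), Neyman allocation among the take-some strata, and take-all strata placed at the top of the stratification. The helpers allocate and adjust_takeall and the numbers are hypothetical.

```r
# Hypothetical helper: the top `takeall` strata are fully sampled, and the
# remaining sample is split among take-some strata proportionally to Nh * Sh
allocate <- function(Nh, Sh, n, takeall) {
  L <- length(Nh)
  all_idx  <- if (takeall > 0) (L - takeall + 1):L else integer(0)
  some_idx <- setdiff(seq_len(L), all_idx)
  nh <- numeric(L)
  nh[all_idx] <- Nh[all_idx]
  w <- Nh[some_idx] * Sh[some_idx]
  nh[some_idx] <- (n - sum(Nh[all_idx])) * w / sum(w)
  nh
}

# Hypothetical helper: increase takeall by one and reallocate until
# n_h <= N_h holds in every take-some stratum
adjust_takeall <- function(Nh, Sh, n, takeall = 0) {
  L <- length(Nh)
  repeat {
    nh <- allocate(Nh, Sh, n, takeall)
    some_idx <- seq_len(L - takeall)
    if (all(nh[some_idx] <= Nh[some_idx]) || takeall >= L - 1) break
    takeall <- takeall + 1
  }
  list(nh = nh, takeall = takeall)
}

# Illustrative strata: the top stratum is small but highly variable
Nh <- c(100, 50, 5)
Sh <- c(1, 2, 100)
noadj <- allocate(Nh, Sh, n = 30, takeall = 0)   # n_3 exceeds N_3 = 5
res   <- adjust_takeall(Nh, Sh, n = 30)          # top stratum becomes take-all
```

Without the adjustment, the allocation asks for more units than the top stratum contains, so the nominal precision cannot actually be achieved; with it, the top stratum is fully sampled and the remaining units are reallocated.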

[Package stratification version 2.2-7 Index]