strata.bh {stratification}    R Documentation

Stratification of a Population Given a Set of Boundaries

Description

The function strata.bh stratifies a population given a set of boundaries. It calculates the stratum sample sizes and the anticipated coefficient of variation or relative root mean squared error.

Usage

strata.bh(x, bh, n = NULL, CV = NULL, Ls = 3, certain = NULL,
          alloc = list(q1 = 0.5, q2 = 0, q3 = 0.5), takenone = 0, 
          bias.penalty = 1, takeall = 0, takeall.adjust = TRUE, 
          rh = rep(1, Ls), model = c("none", "loglinear", "linear",
          "random"), model.control = list())      

Arguments

x

A vector containing the values of the stratification variable X for every unit in the population.

bh

A vector of the L - 1 stratum boundaries (b_1, b_2, ..., b_{L-1}), where L is the total number of strata (excluding the certainty stratum, if any). Therefore, if takenone=0 then L=Ls, and if takenone=1 then L=Ls+1.

n

A numeric: the target sample size. It has no default value. The argument n or the argument CV must be input.

CV

A numeric: the target coefficient of variation (or the target relative root mean squared error if takenone=1). It has no default value. The argument CV or the argument n must be input.

Ls

A numeric: the number of sampled strata (take-none and certain strata are not counted in Ls). The default is 3.

certain

A vector giving the position, in the vector x, of the units that must be included in the sample (see stratification-package). By default certain is NULL, which means that no units are a priori chosen to be in the sample.

alloc

A list specifying the allocation scheme. The list must contain 3 numerics for the 3 exponents q1, q2 and q3 in the general allocation scheme (see stratification-package). The default is Neyman allocation (q1=q3=0.5 and q2=0).

takenone

A numeric: the number of take-none strata (0 or 1). The default is 0, i.e. no take-none stratum is included.

bias.penalty

A numeric between 0 and 1 giving the penalty for the bias in the anticipated mean squared error (MSE) of the survey estimator (see stratification-package). This argument is relevant only if takenone=1. The default is 1.

takeall

A numeric: the number of take-all strata (one of {0, 1, ..., Ls-1}). The default is 0, i.e. no take-all stratum is included.

takeall.adjust

A logical. If TRUE (the default), when n_h > N_h for a take-some stratum, the takeall argument is increased by one and the allocation is carried out again. This is repeated until n_h <= N_h for every take-some stratum. If FALSE, no adjustment is made. Note: in the other functions of the package stratification, this adjustment is not optional; it is made automatically (see stratification-package).

rh

A vector giving the anticipated response rates in each of the Ls sampled strata. A single number can be given if the rates do not vary among strata. The default is 1 in each stratum.

model

A character string identifying the model used to describe the discrepancy between the stratification variable X and the survey variable Y. It can be "none" if one assumes Y=X, "loglinear" for the loglinear model with mortality, "linear" for the heteroscedastic linear model or "random" for the random replacement model (see stratification-package for a description of these models). The default is "none".

model.control

A list of model parameters (see stratification-package). The default values of the parameters correspond to the model Y=X.
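
Details

As a rough illustration of what the arguments bh, alloc, takeall and takeall.adjust control, the base R sketch below stratifies a made-up population at two boundaries and applies Neyman allocation for a target n, adding a take-all stratum whenever n_h > N_h. It is not the package's code. The interval convention (b_{h-1} <= x < b_h), the placement of take-all strata at the top, and all object names are assumptions made for the example. Response rates, rounding to integers, and models other than Y=X are ignored.

```r
# Made-up skewed population (lognormal quantiles); Y = X, as with model = "none".
x  <- qlnorm(ppoints(1000), meanlog = 3, sdlog = 1)
bh <- c(20, 60)   # L - 1 = 2 boundaries, hence L = 3 strata
n  <- 300         # target sample size

# Assumption: stratum h contains the units with b_{h-1} <= x < b_h.
L       <- length(bh) + 1
stratum <- factor(findInterval(x, bh) + 1, levels = seq_len(L))
Nh      <- as.vector(table(stratum))          # stratum population sizes
Sh      <- as.vector(tapply(x, stratum, sd))  # stratum standard deviations

# Neyman allocation (q1 = q3 = 0.5, q2 = 0): n_h proportional to N_h * S_h
# among the take-some strata; take-all strata are sampled entirely.
takeall <- 0
repeat {
  some <- seq_len(L - takeall)        # take-some strata (lowest values of x)
  top  <- setdiff(seq_len(L), some)   # take-all strata (assumed to be the top ones)
  nh   <- numeric(L)
  nh[top]  <- Nh[top]
  w        <- Nh[some] * Sh[some]
  nh[some] <- (n - sum(Nh[top])) * w / sum(w)
  # Stop once every take-some stratum satisfies n_h <= N_h.
  if (all(nh[some] <= Nh[some]) || takeall == L - 1) break
  takeall <- takeall + 1              # the step described for takeall.adjust = TRUE
}

data.frame(stratum = seq_len(L), Nh = Nh, nh = round(nh, 1))
takeall
```

Printing takeall at the end shows whether the loop had to convert any stratum into a take-all stratum. The non-integer nh values in this sketch play the role of nhnonint.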

Value

Nh

A vector of length L containing the population sizes N_h, i.e. the number of units in each stratum.

nh

A vector of length L containing the sample sizes n_h, i.e. the number of units to sample in each stratum. See stratification-package for information about the rounding used to get these integer values.

n

The total sample size (sum(nh)).

nhnonint

A vector of length L containing the non-integer values of the sample sizes, obtained directly from applying the allocation rule (see stratification-package).

certain.info

A vector giving statistics for the certainty stratum (see stratification-package). It contains Nc, the number of units chosen a priori to be in the sample, and meanc, the anticipated mean of Y for these units.

opti.nh

The final value of the criterion to optimize (either the total sample size n if a target CV was given or the RRMSE if a target n was given) calculated with the integer stratum sample sizes nh.

opti.nhnonint

The final value of the criterion to optimize (either the total sample size n if a target CV was given or the RRMSE if a target n was given) calculated with the non-integer stratum sample sizes nhnonint.

meanh

A vector of length L containing the anticipated means of Y in each stratum.

varh

A vector of length L containing the anticipated variances of Y in each stratum.

mean

A numeric: the anticipated global mean value of Y.

RMSE

A numeric: the root mean squared error (or standard error if takenone=0) of the anticipated global mean of Y. This is defined as the square root of: (bias.penalty x bias of the mean)^2 + variance of the mean.

RRMSE

A numeric: the anticipated relative root mean squared error (or coefficient of variation if takenone=0) for the mean of Y, i.e. RMSE divided by mean.

relativebias

A numeric: the anticipated relative bias of the estimator, i.e. (bias.penalty x bias of the mean) divided by mean. If takenone=0, this numeric is zero.

propbiasMSE

A numeric: the proportion of the MSE attributable to the bias of the estimator, i.e. (bias.penalty x bias of the mean)^2 divided by the MSE of the mean. If takenone=0, this numeric is zero.

stratumID

A factor, having the same length as the input x, whose values are either 1, 2, ..., L or "certain". The value "certain" is given to units a priori chosen to be in the sample. This factor identifies, for each observation, the stratum to which it has been assigned.

takeall

The number of take-all strata in the final solution. Note: it is possible that n_h = N_h for non take-all strata because the condition for an automatic addition of a take-all stratum is n_h > N_h.

call

The function call (object of class "call").

date

A character string that contains the system date and time when the function ended.

args

A list of all the argument values input to the function or set by default.
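
Note

The error measures RMSE, RRMSE, relativebias and propbiasMSE described above are linked by simple arithmetic. The base R sketch below works through the definitions with made-up numbers. None of the values come from the package and the object names are only illustrative. It assumes that the MSE used in propbiasMSE equals RMSE^2, and it takes bias.penalty = 1 so that the penalized and unpenalized bias coincide.

```r
# Hypothetical anticipated moments for a design with a take-none stratum.
mean_y       <- 250   # anticipated global mean of Y
bias         <- 4     # bias of the mean caused by the take-none stratum
var_mean     <- 9     # variance of the mean
bias.penalty <- 1     # default penalty

pen_bias     <- bias.penalty * bias
MSE          <- pen_bias^2 + var_mean    # 16 + 9 = 25
RMSE         <- sqrt(MSE)                # 5
RRMSE        <- RMSE / mean_y            # 0.02
relativebias <- pen_bias / mean_y        # 0.016
propbiasMSE  <- pen_bias^2 / MSE         # 0.64

c(RMSE = RMSE, RRMSE = RRMSE,
  relativebias = relativebias, propbiasMSE = propbiasMSE)
```

Setting bias to 0 in this sketch mimics takenone=0: RMSE reduces to the standard error, RRMSE to the coefficient of variation, and both relativebias and propbiasMSE become zero.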

Author(s)

Sophie Baillargeon Sophie.Baillargeon@mat.ulaval.ca and
Louis-Paul Rivest Louis-Paul.Rivest@mat.ulaval.ca

References

Baillargeon, S. and Rivest, L.-P. (2011). The construction of stratified designs in R with the package stratification. Survey Methodology, 37(1), 53-65.

See Also

print.strata, plot.strata, strata.cumrootf, strata.geo, strata.LH

Examples

adjust <- strata.geo(x=USbanks, CV=0.01, Ls=4, alloc=c(0.35,0.35,0))
adjust
adjust$nhnonint
noadjust <- strata.bh(x=USbanks, bh=adjust$bh, CV=0.01, Ls=4,
            alloc=c(0.35,0.35,0), takeall=0, takeall.adjust=FALSE)
noadjust
noadjust$nhnonint
# without the adjustment for a take-all stratum, n is smaller than
# with the adjustment, but the target CV is not reached.

[Package stratification version 2.2-7 Index]