direct {sae} R Documentation

Direct estimators.

Description

This function calculates direct estimators of domain means.

Usage

direct(y, dom, sweight, domsize, data, replace = FALSE)

Arguments

y

vector specifying the individual values of the variable for which we want to estimate the domain means.

dom

vector or factor (same size as y) with domain codes.

sweight

optional vector (same size as y) with sampling weights. When this argument is omitted, estimators are obtained under simple random sampling (SRS).

domsize

D*2 data frame with domain codes in the first column and the corresponding domain population sizes in the second column. This argument is not required when sweight is not included and replace=TRUE (SRS with replacement).

data

optional data frame containing the variables named in y, dom and sweight. By default the variables are taken from the environment from which direct is called.

replace

logical value, FALSE by default. FALSE means that random sampling without replacement within each domain is considered; TRUE means random sampling with replacement within each domain.
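The effect of replace under SRS, and the reason domsize is not needed when replace=TRUE, can be illustrated with the textbook SRS variance formulas (see Cochran, 1977). The following is a minimal sketch in base R for a single domain, not the package's internal code; the values of y and Nd are hypothetical.

```r
# Hypothetical sample from one domain (illustrative values only)
y  <- c(10, 12, 9, 15)
nd <- length(y)   # domain sample size
Nd <- 30          # domain population size (assumed known)

# Under SRS the direct estimator of the domain mean is the sample mean
ybar <- mean(y)
s2   <- var(y)    # sample variance with divisor nd - 1

# Without replacement: finite population correction uses Nd
var_wor <- (1 - nd / Nd) * s2 / nd

# With replacement: no correction, so Nd is not needed
var_wr <- s2 / nd

c(mean = ybar, sd_wor = sqrt(var_wor), sd_wr = sqrt(var_wr))
```

The without-replacement variance is never larger than the with-replacement one, because the factor 1 - nd/Nd is at most 1.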

Value

The function returns a data frame of size D*5 with the following columns:

Domain

domain codes in ascending order.

SampSize

domain sample sizes.

Direct

direct estimators of domain means of variable y.

SD

estimated standard deviations of domain direct estimators. If the sampling design is SRS or Poisson sampling, the estimated variances are unbiased. Otherwise, the estimated variances are obtained under the approximation that second-order inclusion probabilities are the product of first-order inclusion probabilities.

CV

absolute values of the percent coefficients of variation of the domain direct estimators.

Cases with NA values in y, dom or sweight are ignored.
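The Direct, SD and CV columns can be illustrated by hand for one domain when sampling weights are supplied and sampling is without replacement. The sketch below assumes the Horvitz-Thompson form of the estimator and the variance estimator that results when second-order inclusion probabilities are replaced by products of first-order ones. It is an illustration in base R with hypothetical data, not necessarily the exact computation performed inside direct.

```r
# Hypothetical sample from one domain (illustrative values only)
y  <- c(10, 12, 9, 15)   # observed values
w  <- c(5, 5, 10, 10)    # sampling weights (inverse inclusion probabilities)
Nd <- 30                 # domain population size (assumed known)

# Horvitz-Thompson estimator of the domain mean
dir_mean <- sum(w * y) / Nd

# Variance estimator under the product approximation: cross terms vanish
# and each unit contributes w * (w - 1) * y^2
dir_var <- sum(w * (w - 1) * y^2) / Nd^2
dir_sd  <- sqrt(dir_var)

# Percent coefficient of variation, in absolute value
dir_cv <- abs(100 * dir_sd / dir_mean)

c(Direct = dir_mean, SD = dir_sd, CV = dir_cv)
```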

References

- Cochran, W.G. (1977). Sampling techniques. Wiley, New York.

- Rao, J.N.K. (2003). Small Area Estimation. Wiley, London.

- Särndal, C.E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer-Verlag.

See Also

pssynt for post-stratified synthetic estimator, ssd for sample size dependent estimator.

If the sampling design is known, see packages survey or sampling for more exact variance estimation.

Examples

# Load data set with synthetic income data for provinces (domains)
data(incomedata)

# Load population sizes of provinces
data(sizeprov)   

# Compute Horvitz-Thompson direct estimator of mean income for each 
# province under random sampling without replacement within each province.
result1 <- direct(y=income, dom=prov, sweight=weight,
                  domsize=sizeprov[,2:3], data=incomedata)
result1

# The same but using province labels as domain codes
result2 <- direct(y=incomedata$income, dom=incomedata$provlab,
                  sweight=incomedata$weight, domsize=sizeprov[,c(1,3)])
result2

# The same, under SRS without replacement within each province.
result3 <- direct(y=income, dom=provlab, domsize=sizeprov[,c(1,3)],
                  data=incomedata)
result3

# Compute direct estimator of mean income for each province
# under SRS with replacement within each province
result4 <- direct(y=income, dom=provlab, data=incomedata, replace=TRUE)
result4
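A common follow-up is to use the CV column to spot domains with imprecise direct estimates. The sketch below works on a hypothetical data frame that only mimics the column names returned by direct, so it runs without the package. The numbers and the 20% threshold are assumptions for illustration, not values from the package or its data sets.

```r
# Hypothetical output with the same column names that direct() returns
res <- data.frame(
  Domain   = c(1, 2, 3),
  SampSize = c(120, 35, 8),
  Direct   = c(11500, 9800, 14200),
  SD       = c(450, 1100, 4300)
)

# CV as described in the Value section: absolute percent coefficient of variation
res$CV <- abs(100 * res$SD / res$Direct)

# Flag domains whose CV exceeds an illustrative threshold (an assumption)
cv_threshold <- 20
res$unreliable <- res$CV > cv_threshold

# Domains that may call for an alternative estimator
res[res$unreliable, ]
```

With real output, the same filter can be applied directly to a result such as result1, which already contains the CV column.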

[Package sae version 1.3 Index]