BKA {Frames2}    R Documentation

Bankier-Kalton-Anderson estimator

Description

Produces estimates for population total and mean using the Bankier-Kalton-Anderson estimator from survey data obtained from a dual frame sampling design. Confidence intervals are also computed, if required.

Usage

BKA(ysA, ysB, pi_A, pi_B, pik_ab_B, pik_ba_A, domains_A, domains_B, 
conf_level = NULL)

Arguments

ysA

A numeric vector of length n_A or a numeric matrix or data frame of dimensions n_A x c containing information about the variable(s) of interest from s_A.

ysB

A numeric vector of length n_B or a numeric matrix or data frame of dimensions n_B x c containing information about the variable(s) of interest from s_B.

pi_A

A numeric vector of length n_A or a square numeric matrix of dimension n_A containing first order or first and second order inclusion probabilities for the units included in s_A.

pi_B

A numeric vector of length n_B or a square numeric matrix of dimension n_B containing first order or first and second order inclusion probabilities for the units included in s_B.

pik_ab_B

A numeric vector of size n_A containing the first order inclusion probabilities, according to the sampling design in frame B, for the units belonging to the overlap domain that have been selected in s_A.

pik_ba_A

A numeric vector of size n_B containing the first order inclusion probabilities, according to the sampling design in frame A, for the units belonging to the overlap domain that have been selected in s_B.

domains_A

A character vector of size n_A indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab".

domains_B

A character vector of size n_B indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba".

conf_level

(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired.

Details

The BKA estimator of the population total is given by

\hat{Y}_{BKA} = \sum_{i \in s_A}\tilde{d}_i^A y_i + \sum_{i \in s_B}\tilde{d}_i^B y_i

where

\tilde{d}_i^A = \left\{\begin{array}{ll} d_i^A & \textrm{if } i \in a\\ (1/d_i^A + 1/d_i^B)^{-1} & \textrm{if } i \in ab \end{array} \right.

and

\tilde{d}_i^B = \left\{\begin{array}{ll} d_i^B & \textrm{if } i \in b\\ (1/d_i^A + 1/d_i^B)^{-1} & \textrm{if } i \in ba \end{array} \right.

Here d_i^A and d_i^B are the design weights, obtained as the inverse of the first order inclusion probabilities, that is, d_i^A = 1/\pi_i^A and d_i^B = 1/\pi_i^B.
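The adjusted weights can be traced by hand on a small example. The following is a minimal sketch in base R, not the package's internal code: the toy values and the helper name bka_weights are hypothetical, and it assumes that the probability under the other frame is coded as 0 for units outside the overlap domain.

```r
# Toy samples (hypothetical values, not taken from the Frames2 datasets)
yA       <- c(10, 20, 30, 40)        # variable of interest in s_A
piA      <- c(0.2, 0.2, 0.5, 0.5)    # first order probabilities under design A
pik_ab_B <- c(0, 0, 0.25, 0.25)      # design-B probabilities (assumed 0 outside the overlap)
domA     <- c("a", "a", "ab", "ab")

yB       <- c(5, 15, 25)             # variable of interest in s_B
piB      <- c(0.25, 0.25, 0.25)      # first order probabilities under design B
pik_ba_A <- c(0, 0.5, 0.5)           # design-A probabilities (assumed 0 outside the overlap)
domB     <- c("b", "ba", "ba")

# Hypothetical helper: adjusted weight d-tilde for one sample.
# Outside the overlap the weight is the usual 1 / pi; inside it is
# (1/d_own + 1/d_other)^(-1), which equals 1 / (pi_own + pi_other).
bka_weights <- function(pi_own, pi_other, domain, overlap_label) {
  ifelse(domain == overlap_label, 1 / (pi_own + pi_other), 1 / pi_own)
}

wA <- bka_weights(piA, pik_ab_B, domA, "ab")
wB <- bka_weights(piB, pik_ba_A, domB, "ba")

# BKA estimate of the population total
Y_bka <- sum(wA * yA) + sum(wB * yB)
print(Y_bka)  # 950/3, about 316.67
```

In the overlap domain each unit receives the same weight whichever sample it was drawn from, so no unit in ab is counted twice in expectation.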

To estimate the variance of this estimator, the following approach, proposed by Rao and Skinner (1996), is used:

\hat{V}(\hat{Y}_{BKA}) = \hat{V}(\sum_{i \in s_A}\tilde{z}_i^A) + \hat{V}(\sum_{i \in s_B}\tilde{z}_i^B)

with \tilde{z}_i^A = \delta_i(a)y_i + (1 - \delta_i(a))y_i\pi_i^A/(\pi_i^A + \pi_i^B) and \tilde{z}_i^B = \delta_i(b)y_i + (1 - \delta_i(b))y_i\pi_i^B/(\pi_i^A + \pi_i^B), where \delta_i(a) and \delta_i(b) are the indicator variables for domain a and domain b, respectively. If both first and second order inclusion probabilities are known, the variances and covariances involved in the calculation of \hat{V}(\hat{Y}_{BKA}) are estimated using functions VarHT and CovHT, respectively. If only first order inclusion probabilities are known, variances are estimated using Deville's method and covariances are estimated using the following expression

\widehat{Cov}(\hat{X}, \hat{Y}) = \frac{\hat{V}(X + Y) - \hat{V}(X) - \hat{V}(Y)}{2}
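Two pieces of this variance approach can be checked numerically. The sketch below, in base R with hypothetical toy values, first builds the z variables and confirms that expanding them with the ordinary design weights 1/\pi reproduces the BKA total, which is why the variance of the estimator reduces to the variances of two single-frame totals. It then checks the covariance expression, using the plain sample variance as a stand-in for \hat{V}; this stand-in is an assumption made for illustration and is not Deville's method nor the package's VarHT.

```r
# Toy samples (hypothetical values; probabilities under the other frame
# are assumed to be coded as 0 outside the overlap domain)
yA <- c(10, 20, 30, 40); piA <- c(0.2, 0.2, 0.5, 0.5)
pik_ab_B <- c(0, 0, 0.25, 0.25); domA <- c("a", "a", "ab", "ab")
yB <- c(5, 15, 25); piB <- c(0.25, 0.25, 0.25)
pik_ba_A <- c(0, 0.5, 0.5); domB <- c("b", "ba", "ba")

# z variables: y itself outside the overlap, and y scaled by the share of
# the own-frame probability inside the overlap
zA <- ifelse(domA == "a", yA, yA * piA / (piA + pik_ab_B))
zB <- ifelse(domB == "b", yB, yB * piB / (pik_ba_A + piB))

# Horvitz-Thompson type totals of the z variables, one per frame
ht_total <- sum(zA / piA) + sum(zB / piB)

# BKA total computed directly from the adjusted weights
wA <- ifelse(domA == "ab", 1 / (piA + pik_ab_B), 1 / piA)
wB <- ifelse(domB == "ba", 1 / (pik_ba_A + piB), 1 / piB)
Y_bka <- sum(wA * yA) + sum(wB * yB)
print(c(ht_total = ht_total, Y_bka = Y_bka))  # the two values coincide

# Covariance recovered from variances only:
# Cov(X, Y) = (V(X + Y) - V(X) - V(Y)) / 2
v_hat <- function(u) var(u)   # stand-in variance estimator (assumption)
x <- c(1, 4, 2, 8)
y <- c(3, 1, 5, 2)
cov_from_var <- (v_hat(x + y) - v_hat(x) - v_hat(y)) / 2
print(c(cov_from_var = cov_from_var, cov = cov(x, y)))
```

The covariance expression is useful because it needs only a variance estimator, so any method that works from first order probabilities alone can also supply covariances.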

Value

BKA returns an object of class "EstimatorDF" which is a list with, at least, the following components:

Call

the matched call.

Est

total and mean estimation for main variable(s).

VarEst

variance estimation for main variable(s).

If parameter conf_level is different from NULL, object includes component

ConfInt

total and mean estimation and confidence intervals for main variable(s).

In addition, components TotDomEst and MeanDomEst are available when the estimator is based on estimators of the domains. Component Param shows the values of the parameters involved in the calculation of the estimator (if any). By default, only the Est component (or the ConfInt component, if parameter conf_level is different from NULL) is shown. All the components of the object can be accessed by using function summary.

References

Bankier, M. D. (1986) Estimators Based on Several Stratified Samples With Applications to Multiple Frame Surveys. Journal of the American Statistical Association, Vol. 81, 1074 - 1079.

Kalton, G. and Anderson, D. W. (1986) Sampling Rare Populations. Journal of the Royal Statistical Society, Ser. A, Vol. 149, 65 - 82.

Rao, J. N. K. and Skinner, C. J. (1996) Estimation in Dual Frame Surveys with Complex Designs. Proceedings of the Survey Method Section, Statistical Society of Canada, 63 - 68.

Skinner, C. J. and Rao, J. N. K. (1996) Estimation in Dual Frame Surveys with Complex Designs. Journal of the American Statistical Association, Vol. 91, 433, 349 - 356.

See Also

JackBKA

Examples

data(DatA)
data(DatB)
data(PiklA)
data(PiklB)

# Calculate the BKA estimator of the population total for variable Leisure
BKA(DatA$Lei, DatB$Lei, PiklA, PiklB, DatA$ProbB, DatB$ProbA, 
DatA$Domain, DatB$Domain)

# Now calculate the BKA estimator and a 90% confidence interval for the
# population total of variable Feeding, using only first order inclusion probabilities
BKA(DatA$Feed, DatB$Feed, DatA$ProbA, DatB$ProbB, DatA$ProbB, 
DatB$ProbA, DatA$Domain, DatB$Domain, 0.90)

[Package Frames2 version 0.2.1 Index]