JackMLDF {Frames2}    R Documentation

Confidence intervals for MLDF estimator based on jackknife method

Description

Calculates confidence intervals for the MLDF estimator using the jackknife procedure.

Usage

JackMLDF (ysA, ysB, pik_A, pik_B, domains_A, domains_B, xsA, xsB, xA, xB, ind_samA, 
ind_samB, ind_domA, ind_domB, N, conf_level, sdA = "srs", sdB = "srs", strA = NULL, 
strB = NULL, clusA = NULL, clusB = NULL, fcpA = FALSE, fcpB = FALSE)

Arguments

ysA

A data frame containing information about one or more factors, each one of dimension $n_A$, collected from $s_A$.

ysB

A data frame containing information about one or more factors, each one of dimension $n_B$, collected from $s_B$.

pik_A

A numeric vector of length $n_A$ containing first order inclusion probabilities for units included in $s_A$.

pik_B

A numeric vector of length $n_B$ containing first order inclusion probabilities for units included in $s_B$.

domains_A

A character vector of size $n_A$ indicating the domain each unit from $s_A$ belongs to. Possible values are "a" and "ab".

domains_B

A character vector of size $n_B$ indicating the domain each unit from $s_B$ belongs to. Possible values are "b" and "ba".

xsA

A numeric vector of length $n_A$ or a numeric matrix or data frame of dimensions $n_A \times m$, with $m$ the number of auxiliary variables, containing auxiliary information in frame A for units included in $s_A$.

xsB

A numeric vector of length $n_B$ or a numeric matrix or data frame of dimensions $n_B \times m$, with $m$ the number of auxiliary variables, containing auxiliary information in frame B for units included in $s_B$.

xA

A numeric vector of length $N_A$ or a numeric matrix or data frame of dimensions $N_A \times m_A$, with $m_A$ the number of auxiliary variables in frame A, containing auxiliary information for the units in frame A.

xB

A numeric vector of length $N_B$ or a numeric matrix or data frame of dimensions $N_B \times m_B$, with $m_B$ the number of auxiliary variables in frame B, containing auxiliary information for the units in frame B.

ind_samA

A numeric vector of length $n_A$ containing the identifiers of the units of frame A (from 1 to $N_A$) that belong to $s_A$.

ind_samB

A numeric vector of length $n_B$ containing the identifiers of the units of frame B (from 1 to $N_B$) that belong to $s_B$.

ind_domA

A character vector of length $N_A$ indicating the domain each unit from frame A belongs to. Possible values are "a" and "ab".

ind_domB

A character vector of length $N_B$ indicating the domain each unit from frame B belongs to. Possible values are "b" and "ba".

N

A numeric value indicating the size of the population.

conf_level

A numeric value indicating the confidence level for the confidence intervals.

sdA

(Optional) A character vector indicating the sampling design considered in frame A. Possible values are "srs" (simple random sampling without replacement), "pps" (probabilities proportional to size sampling), "str" (stratified sampling), "clu" (cluster sampling) and "strclu" (stratified cluster sampling). Default is "srs".

sdB

(Optional) A character vector indicating the sampling design considered in frame B. Possible values are "srs" (simple random sampling without replacement), "pps" (probabilities proportional to size sampling), "str" (stratified sampling), "clu" (cluster sampling) and "strclu" (stratified cluster sampling). Default is "srs".

strA

(Optional) A numeric vector indicating the stratum each unit in frame A belongs to, if a stratified sampling or a stratified cluster sampling has been considered in frame A.

strB

(Optional) A numeric vector indicating the stratum each unit in frame B belongs to, if a stratified sampling or a stratified cluster sampling has been considered in frame B.

clusA

(Optional) A numeric vector indicating the cluster each unit in frame A belongs to, if a cluster sampling or a stratified cluster sampling has been considered in frame A.

clusB

(Optional) A numeric vector indicating the cluster each unit in frame B belongs to, if a cluster sampling or a stratified cluster sampling has been considered in frame B.

fcpA

(Optional) A logical value indicating whether a finite population correction factor should be considered in frame A. Default is FALSE.

fcpB

(Optional) A logical value indicating whether a finite population correction factor should be considered in frame B. Default is FALSE.

Details

Suppose a non-stratified sampling design in frame A and a stratified sampling design in frame B, where the frame has been divided into $L$ strata and a sample of size $n_{Bl}$ is selected from the $N_{Bl}$ units composing the $l$-th stratum. In this context, the jackknife variance estimator of an estimator $\hat{Y}_c$ is given by

$$v_J(\hat{Y}_c) = \frac{n_A-1}{n_A}\sum_{i\in s_A} \left(\hat{Y}_c^A(i) - \overline{Y}_c^A\right)^2 + \sum_{l=1}^{L}\frac{n_{Bl}-1}{n_{Bl}} \sum_{j\in s_{Bl}} \left(\hat{Y}_c^B(lj) - \overline{Y}_c^{Bl}\right)^2$$

with $\hat{Y}_c^A(i)$ the value of the estimator $\hat{Y}_c$ after dropping the $i$-th unit from ysA and $\overline{Y}_c^A$ the mean of the values $\hat{Y}_c^A(i)$. Similarly, $\hat{Y}_c^B(lj)$ is the value taken by $\hat{Y}_c$ after dropping the $j$-th unit of the $l$-th stratum from sample ysB, and $\overline{Y}_c^{Bl}$ is the mean of the values $\hat{Y}_c^B(lj)$. If needed, a finite population correction factor can be included in either frame by replacing $\hat{Y}_c^A(i)$ or $\hat{Y}_c^B(lj)$ with $\hat{Y}_c^{A*}(i) = \hat{Y}_c + \sqrt{1-\overline{\pi}_A}\,(\hat{Y}_c^A(i) - \hat{Y}_c)$ or $\hat{Y}_c^{B*}(lj) = \hat{Y}_c + \sqrt{1-\overline{\pi}_B}\,(\hat{Y}_c^B(lj) - \hat{Y}_c)$, where $\overline{\pi}_A = \sum_{i \in s_A}\pi_{iA}/n_A$ and $\overline{\pi}_B = \sum_{j \in s_B}\pi_{jB}/n_B$. A confidence interval for any parameter of interest, $Y$, can then be calculated using the pivotal method.
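The contribution of a single frame to the jackknife variance can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code used inside JackMLDF: the names jack_var, hajek_total and N_pop are hypothetical, the estimator is a simple ratio-type total computed from one sample rather than the MLDF estimator, and the interval assumes a normal approximation for the pivotal quantity. In the dual frame setting, the two frame contributions would be added, with the full estimator recomputed from both samples after each deletion.

```r
# Delete-one jackknife variance for one frame (hypothetical helper).
# `estimator` is any function(y, pik) returning a single numeric estimate.
# `strata` defaults to a single stratum, i.e. a non-stratified design.
jack_var <- function(y, pik, estimator,
                     strata = rep(1, length(y)), fpc = FALSE) {
  full <- estimator(y, pik)
  # Recompute the estimator after dropping each unit in turn
  reps <- vapply(seq_along(y),
                 function(i) estimator(y[-i], pik[-i]),
                 numeric(1))
  if (fpc) {
    # Finite population correction: shrink replicates toward the full estimate
    reps <- full + sqrt(1 - mean(pik)) * (reps - full)
  }
  # Within each stratum: (n_l - 1) / n_l times the sum of squared deviations
  per_stratum <- tapply(reps, strata, function(r) {
    (length(r) - 1) / length(r) * sum((r - mean(r))^2)
  })
  sum(per_stratum)
}

# Toy data: ratio-type estimator of a total, population size assumed known
N_pop <- 100
hajek_total <- function(y, pik) N_pop * sum(y / pik) / sum(1 / pik)

y   <- c(12, 15, 9, 20, 14, 11, 17, 13, 16, 10)
pik <- rep(length(y) / N_pop, length(y))

est <- hajek_total(y, pik)
v   <- jack_var(y, pik, hajek_total)

# Confidence interval from the pivot (est - Y) / sqrt(v), assumed ~ N(0, 1)
conf_level <- 0.95
z  <- qnorm(1 - (1 - conf_level) / 2)
ci <- c(lower = est - z * sqrt(v), upper = est + z * sqrt(v))
```

With equal inclusion probabilities and no strata, this reduces to the familiar variance of an expanded sample mean, which gives a quick sanity check on the replicate weights. Passing a stratum label per unit through strata gives the stratified term of the formula above.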

Value

A numeric matrix containing estimates of the population total and population mean and their corresponding confidence intervals obtained through the jackknife method.

References

Molina, D., Rueda, M., Arcos, A. and Ranalli, M. G. (2015) Multinomial logistic estimation in dual frame surveys. Statistics and Operations Research Transactions (SORT). To be printed.

Wolter, K. M. (2007) Introduction to Variance Estimation. 2nd Edition. Springer, Inc., New York.

See Also

MLDF

Examples

data(DatMA)
data(DatMB)
data(DatPopM)

N <- nrow(DatPopM)
levels(DatPopM$Domain) <- c(levels(DatPopM$Domain), "ba")
DatPopMA <- subset(DatPopM, DatPopM$Domain == "a" | DatPopM$Domain == "ab")
DatPopMB <- subset(DatPopM, DatPopM$Domain == "b" | DatPopM$Domain == "ab")
DatPopMB[DatPopMB$Domain == "ab",]$Domain <- "ba"


# Obtain a 95% jackknife confidence interval for the variable Prog,
# supposing a pps sampling design in frame A and simple random sampling
# without replacement in frame B, with no finite population correction
# factor in either frame.
JackMLDF(DatMA$Prog, DatMB$Prog, DatMA$ProbA, DatMB$ProbB, DatMA$Domain,
         DatMB$Domain, DatMA$Read, DatMB$Read, DatPopMA$Read, DatPopMB$Read,
         DatMA$Id_Frame, DatMB$Id_Frame, DatPopMA$Domain, DatPopMB$Domain, N, 0.95,
         "pps", "srs")


[Package Frames2 version 0.2.1 Index]