JackCalDF {Frames2} R Documentation

Confidence intervals for dual frame calibration estimator based on jackknife method

Description

Calculates confidence intervals for the dual frame calibration estimator using the jackknife procedure.

Usage

JackCalDF(ysA, ysB, piA, piB, domainsA, domainsB, N_A = NULL, N_B = NULL, 
N_ab = NULL, xsAFrameA = NULL, xsBFrameA = NULL, xsAFrameB = NULL, 
xsBFrameB = NULL, xsT = NULL, XA = NULL, XB = NULL, X = NULL, met = "linear", 
conf_level, sdA = "srs", sdB = "srs", strA = NULL, strB = NULL, clusA = NULL,
clusB = NULL, fcpA = FALSE, fcpB = FALSE)

Arguments

ysA

A numeric vector of length nA or a numeric matrix or data frame of dimensions nA x c containing information about variable of interest from s_A.

ysB

A numeric vector of length nB or a numeric matrix or data frame of dimensions nB x c containing information about variable of interest from s_B.

piA

A numeric vector of length nA or a square numeric matrix of dimension nA containing first order or first and second order inclusion probabilities for units included in s_A.

piB

A numeric vector of length nB or a square numeric matrix of dimension nB containing first order or first and second order inclusion probabilities for units included in s_B.

domainsA

A character vector of size nA indicating the domain each unit from s_A belongs to. Possible values are "a" and "ab".

domainsB

A character vector of size nB indicating the domain each unit from s_B belongs to. Possible values are "b" and "ba".

N_A

(Optional) A numeric value indicating the size of frame A.

N_B

(Optional) A numeric value indicating the size of frame B.

N_ab

(Optional) A numeric value indicating the size of the overlap domain.

xsAFrameA

(Optional) A numeric vector of length nA or a numeric matrix or data frame of dimensions nA x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information in frame A for units included in s_A.

xsBFrameA

(Optional) A numeric vector of length nB or a numeric matrix or data frame of dimensions nB x m_A, with m_A the number of auxiliary variables in frame A, containing auxiliary information in frame A for units included in s_B. For units in domain b, these values are 0.

xsAFrameB

(Optional) A numeric vector of length nA or a numeric matrix or data frame of dimensions nA x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information in frame B for units included in s_A. For units in domain a, these values are 0.

xsBFrameB

(Optional) A numeric vector of length nB or a numeric matrix or data frame of dimensions nB x m_B, with m_B the number of auxiliary variables in frame B, containing auxiliary information in frame B for units included in s_B.

xsT

(Optional) A numeric vector of length n or a numeric matrix or data frame of dimensions n x m_T, with m_T the number of auxiliary variables in both frames, containing auxiliary information for all units in the entire sample s = s_A \cup s_B.

XA

(Optional) A numeric value or vector of length m_A, with m_A the number of auxiliary variables in frame A, indicating the population totals for the auxiliary variables considered in frame A.

XB

(Optional) A numeric value or vector of length m_B, with m_B the number of auxiliary variables in frame B, indicating the population totals for the auxiliary variables considered in frame B.

X

(Optional) A numeric value or vector of length m_T, with m_T the number of auxiliary variables in both frames, indicating the population totals for the auxiliary variables considered in both frames.

met

(Optional) A character vector indicating the distance to be used in the calibration process. Possible values are "linear", "raking" and "logit". Default is "linear".

conf_level

A numeric value indicating the confidence level for the confidence intervals.

sdA

(Optional) A character vector indicating the sampling design considered in frame A. Possible values are "srs" (simple random sampling without replacement), "pps" (probabilities proportional to size sampling), "str" (stratified sampling), "clu" (cluster sampling) and "strclu" (stratified cluster sampling). Default is "srs".

sdB

(Optional) A character vector indicating the sampling design considered in frame B. Possible values are "srs" (simple random sampling without replacement), "pps" (probabilities proportional to size sampling), "str" (stratified sampling), "clu" (cluster sampling) and "strclu" (stratified cluster sampling). Default is "srs".

strA

(Optional) A numeric vector indicating the stratum each unit in frame A belongs to, if a stratified sampling or a stratified cluster sampling has been considered in frame A.

strB

(Optional) A numeric vector indicating the stratum each unit in frame B belongs to, if a stratified sampling or a stratified cluster sampling has been considered in frame B.

clusA

(Optional) A numeric vector indicating the cluster each unit in frame A belongs to, if a cluster sampling or a stratified cluster sampling has been considered in frame A.

clusB

(Optional) A numeric vector indicating the cluster each unit in frame B belongs to, if a cluster sampling or a stratified cluster sampling has been considered in frame B.

fcpA

(Optional) A logical value indicating whether a finite population correction factor should be considered in frame A. Default is FALSE.

fcpB

(Optional) A logical value indicating whether a finite population correction factor should be considered in frame B. Default is FALSE.

Details

Suppose a non-stratified sampling design in frame A and a stratified sampling design in frame B, where the frame has been divided into L strata and a sample of size n_{Bl} is selected from the N_{Bl} units composing the l-th stratum. In this context, the jackknife variance estimator of an estimator \hat{Y}_c is given by

v_J(\hat{Y}_c) = \frac{n_{A}-1}{n_{A}}\sum_{i\in s_A} (\hat{Y}_{c}^{A}(i) -\overline{Y}_{c}^{A})^2 + \sum_{l=1}^{L}\frac{n_{Bl}-1}{n_{Bl}} \sum_{j\in s_{Bl}} (\hat{Y}_{c}^{B}(lj) -\overline{Y}_{c}^{Bl})^2

with \hat{Y}_c^A(i) the value of the estimator \hat{Y}_c after dropping the i-th unit from s_A and \overline{Y}_{c}^{A} the mean of the values \hat{Y}_c^A(i). Similarly, \hat{Y}_c^B(lj) is the value taken by \hat{Y}_c after dropping the j-th unit of the l-th stratum from s_B, and \overline{Y}_{c}^{Bl} is the mean of the values \hat{Y}_c^B(lj) over the units of that stratum. If needed, a finite population correction factor can be included in either frame by replacing \hat{Y}_{c}^{A}(i) or \hat{Y}_{c}^{B}(lj) with \hat{Y}_{c}^{A*}(i)= \hat{Y}_{c}+\sqrt{1-\overline{\pi}_A} (\hat{Y}_{c}^{A}(i) -\hat{Y}_{c}) or \hat{Y}_{c}^{B*}(lj)= \hat{Y}_{c}+\sqrt{1-\overline{\pi}_B} (\hat{Y}_{c}^{B}(lj) -\hat{Y}_{c}), where \overline{\pi}_A = \sum_{i \in s_A}\pi_{iA}/n_A and \overline{\pi}_B = \sum_{j \in s_B}\pi_{jB}/n_B. A confidence interval for any parameter of interest Y can then be calculated using the pivotal method.
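
The variance formula above can be illustrated with a minimal sketch in base R. It is not the code used inside JackCalDF: the helper names jack_var and jack_ci are hypothetical, the delete-one replicate estimates are assumed to have been computed already (toy numbers are used here), and the interval assumes a normal approximation for the pivotal quantity, which this page does not state explicitly.

```r
# Hypothetical helper (not part of Frames2): jackknife variance computed from
# delete-one replicate estimates, following the formula in Details.
#   est             full-sample estimate \hat{Y}_c
#   repA            replicates obtained by dropping each unit of s_A
#   repB_by_stratum list with one vector of replicates per stratum of s_B
#   pi_bar_A/B      mean first order inclusion probability in each frame
jack_var <- function(est, repA, repB_by_stratum,
                     fcpA = FALSE, fcpB = FALSE,
                     pi_bar_A = 0, pi_bar_B = 0) {
  # Optional finite population correction: shrink replicates towards est
  if (fcpA) {
    repA <- est + sqrt(1 - pi_bar_A) * (repA - est)
  }
  if (fcpB) {
    repB_by_stratum <- lapply(
      repB_by_stratum,
      function(r) est + sqrt(1 - pi_bar_B) * (r - est)
    )
  }
  # One term of the formula: (n - 1) / n times the sum of squared
  # deviations of the replicates from their own mean
  term <- function(r) (length(r) - 1) / length(r) * sum((r - mean(r))^2)
  term(repA) + sum(vapply(repB_by_stratum, term, numeric(1)))
}

# Hypothetical helper: symmetric interval, assuming approximate normality
jack_ci <- function(est, v, conf_level) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  c(lower = est - z * sqrt(v), upper = est + z * sqrt(v))
}

# Toy replicate values, made up for illustration only
est  <- 100
repA <- c(98, 101, 103)
repB <- list(c(99, 101), c(100, 102, 104))

v     <- jack_var(est, repA, repB)   # 133/9, about 14.78
v_fpc <- jack_var(est, repA, repB, fcpA = TRUE, fcpB = TRUE,
                  pi_bar_A = 0.19, pi_bar_B = 0.19)
ci    <- jack_ci(est, v, conf_level = 0.95)
```

In this sketch, taking the deviations around each group's own mean keeps the frame A term and each stratum term of frame B separate, as in the formula. With the same mean inclusion probability in both frames, the correction simply scales the variance by 1 - \overline{\pi}.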

Value

A numeric matrix containing estimates of the population total and the population mean and their corresponding confidence intervals obtained through the jackknife method.

References

Wolter, K. M. (2007) Introduction to Variance Estimation. 2nd Edition. Springer, Inc., New York.

See Also

CalDF

Examples

data(DatA)
data(DatB)

# Obtain a 95% jackknife confidence interval for variable Clothing,
# with frame sizes and overlap domain size known, assuming stratified
# sampling in frame A and simple random sampling without replacement
# in frame B, with no finite population correction factor in either frame.
JackCalDF(DatA$Clo, DatB$Clo, DatA$ProbA, DatB$ProbB, 
DatA$Domain, DatB$Domain, N_A = 1735, N_B = 1191, N_ab = 601, conf_level = 0.95,
sdA = "str", sdB = "srs", strA = DatA$Stratum)

# Finally, consider a finite population correction factor in both frames.
JackCalDF(DatA$Clo, DatB$Clo, DatA$ProbA, DatB$ProbB, 
DatA$Domain, DatB$Domain, N_A = 1735, N_B = 1191, N_ab = 601, conf_level = 0.95,
sdA = "str", sdB = "srs", strA = DatA$Stratum, fcpA = TRUE, fcpB = TRUE)

[Package Frames2 version 0.2.1 Index]