R: Pseudo Maximum Likelihood estimator

PML {Frames2}

R Documentation

Pseudo Maximum Likelihood estimator

Description

Produces estimates for population totals and means using the PML estimator from survey data obtained from a dual frame sampling design. Confidence intervals are also computed, if required.

Usage

PML(ysA, ysB, pi_A, pi_B, domains_A, domains_B, N_A, N_B, conf_level = NULL)

Arguments

`ysA`	A numeric vector of length `n_A` or a numeric matrix or data frame of dimensions `n_A` x `c` containing the values of the variable(s) of interest for the units in sample `s_A`.
`ysB`	A numeric vector of length `n_B` or a numeric matrix or data frame of dimensions `n_B` x `c` containing the values of the variable(s) of interest for the units in sample `s_B`.
`pi_A`	A numeric vector of length `n_A` or a square numeric matrix of dimension `n_A` containing first order or first and second order inclusion probabilities for units included in `s_A`.
`pi_B`	A numeric vector of length `n_B` or a square numeric matrix of dimension `n_B` containing first order or first and second order inclusion probabilities for units included in `s_B`.
`domains_A`	A character vector of size `n_A` indicating the domain each unit from `s_A` belongs to. Possible values are "a" and "ab".
`domains_B`	A character vector of size `n_B` indicating the domain each unit from `s_B` belongs to. Possible values are "b" and "ba".
`N_A`	A numeric value indicating the size of frame A.
`N_B`	A numeric value indicating the size of frame B.
`conf_level`	(Optional) A numeric value indicating the confidence level for the confidence intervals, if desired.

Details

The Pseudo Maximum Likelihood estimator of the population total is given by

\hat{Y}_{PML}(\hat{\theta}) = \frac{N_A - \hat{N}_{ab,PML}}{\hat{N}_a}\hat{Y}_a^A + \frac{N_B - \hat{N}_{ab,PML}}{\hat{N}_b}\hat{Y}_b^B + \frac{\hat{N}_{ab,PML}}{\hat{\theta}\hat{N}_{ab}^A + (1 - \hat{\theta})\hat{N}_{ab}^B}[\hat{\theta}\hat{Y}_{ab}^A + (1 - \hat{\theta})\hat{Y}_{ab}^B]

where \hat{\theta} \in [0, 1] and \hat{N}_{ab,PML} is the smaller of the roots of the quadratic equation

[\hat{\theta}/N_B + (1 - \hat{\theta})/N_A]x^2 - [1 + \hat{\theta}\hat{N}_{ab}^A/N_B + (1 - \hat{\theta})\hat{N}_{ab}^B/N_A]x + \hat{\theta}\hat{N}_{ab}^A + (1 - \hat{\theta})\hat{N}_{ab}^B=0.
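As a rough illustration of the two expressions above, the following R sketch solves the quadratic for \hat{N}_{ab,PML} and plugs the smaller root into the total estimator. The helper name `pml_total_sketch` is hypothetical and is not part of Frames2, all numeric inputs are made-up values chosen only for illustration, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of Frames2): evaluates the PML total
# from domain estimates that are assumed to be already available.
pml_total_sketch <- function(theta, N_A, N_B,
                             Nhat_a, Nhat_b, Nhat_abA, Nhat_abB,
                             Yhat_a, Yhat_b, Yhat_abA, Yhat_abB) {
  # Coefficients of the quadratic qa * x^2 - qb * x + qc = 0
  qa <- theta / N_B + (1 - theta) / N_A
  qb <- 1 + theta * Nhat_abA / N_B + (1 - theta) * Nhat_abB / N_A
  qc <- theta * Nhat_abA + (1 - theta) * Nhat_abB

  # Smaller of the two roots
  Nhat_ab_pml <- (qb - sqrt(qb^2 - 4 * qa * qc)) / (2 * qa)

  # Three terms of the estimator: domain a, domain b, overlap domain
  total <- (N_A - Nhat_ab_pml) / Nhat_a * Yhat_a +
    (N_B - Nhat_ab_pml) / Nhat_b * Yhat_b +
    Nhat_ab_pml / qc * (theta * Yhat_abA + (1 - theta) * Yhat_abB)

  list(Nhat_ab_pml = Nhat_ab_pml, total = total)
}

# Made-up inputs, for illustration only
out <- pml_total_sketch(theta = 0.5, N_A = 1735, N_B = 1191,
                        Nhat_a = 1480, Nhat_b = 950,
                        Nhat_abA = 250, Nhat_abB = 240,
                        Yhat_a = 52000, Yhat_b = 31000,
                        Yhat_abA = 9000, Yhat_abB = 8700)
out$Nhat_ab_pml
out$total
```

Note that the denominator of the third term is the same weighted overlap size that appears as the constant term of the quadratic, so the sketch reuses `qc` for both.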

The optimal value for \hat{\theta} is \frac{\hat{N}_aN_B\hat{V}(\hat{N}_{ab}^B)}{\hat{N}_aN_B\hat{V}(\hat{N}_{ab}^B) + \hat{N}_bN_A\hat{V}(\hat{N}_{ab}^A)}. The variance is estimated according to the following expression

\hat{V}(\hat{Y}_{PML}(\hat{\theta})) = \hat{V}(\sum_{i \in s_A}\tilde{z}_i^A) + \hat{V}(\sum_{i \in s_B}\tilde{z}_i^B)

where \tilde{z}_i^A = y_i - \frac{\hat{Y}_a}{\hat{N}_a} if i \in a and \tilde{z}_i^A = \hat{\gamma}_{opt}(y_i - \frac{\hat{Y}_{ab}^A}{\hat{N}_{ab}^A}) + \hat{\lambda} \hat{\phi} if i \in ab, with

\hat{\gamma}_{opt} = \frac{\hat{N}_a N_B \hat{V}(\hat{N}_{ab}^B)}{\hat{N}_a N_B \hat{V}(\hat{N}_{ab}^B) + \hat{N}_b N_A \hat{V}(\hat{N}_{ab}^A)}

\hat{\lambda} = \frac{n_A/N_A \hat{Y}_{ab}^A + n_B/N_B \hat{Y}_{ab}^B}{n_A/N_A \hat{N}_{ab}^A + n_B/N_B \hat{N}_{ab}^B} - \frac{\hat{Y}_a}{\hat{N}_a} - \frac{\hat{Y}_b}{\hat{N}_b}

\hat{\phi} = \frac{n_A \hat{N}_b}{n_A \hat{N}_b + n_B\hat{N}_a}

Similarly, we define \tilde{z}_i^B = y_i - \frac{\hat{Y}_b}{\hat{N}_b} if i \in b and \tilde{z}_i^B = (1 - \hat{\gamma}_{opt})(y_i - \frac{\hat{Y}_{ab}^B}{\hat{N}_{ab}^B}) + \hat{\lambda}(1 - \hat{\phi}) if i \in ba.
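The optimal value of \hat{\theta} given earlier in this section is a simple ratio and can be evaluated directly once the domain size estimates and their estimated variances are available. The sketch below assumes those quantities have already been computed; the helper name `theta_opt_sketch` and the argument names `V_abA` and `V_abB` (standing for \hat{V}(\hat{N}_{ab}^A) and \hat{V}(\hat{N}_{ab}^B)) are hypothetical and not part of Frames2.

```r
# Hypothetical helper (not part of Frames2): optimal theta from
# domain size estimates and the estimated variances of the two
# overlap size estimators.
theta_opt_sketch <- function(Nhat_a, Nhat_b, N_A, N_B, V_abA, V_abB) {
  num <- Nhat_a * N_B * V_abB
  num / (num + Nhat_b * N_A * V_abA)
}

# Made-up inputs, for illustration only
theta_hat <- theta_opt_sketch(Nhat_a = 1480, Nhat_b = 950,
                              N_A = 1735, N_B = 1191,
                              V_abA = 400, V_abB = 350)
theta_hat
```

Because every input is non-negative, the result always lies in [0, 1]. A larger variance for the frame A overlap estimate moves the weight toward frame B, and vice versa.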

Value

PML returns an object of class "EstimatorDF" which is a list with, at least, the following components:

`Call`	the matched call.
`Est`	total and mean estimation for main variable(s).
`VarEst`	variance estimation for main variable(s).

If parameter conf_level is different from NULL, the object also includes the component

`ConfInt`	total and mean estimation and confidence intervals for main variable(s).

In addition, components TotDomEst and MeanDomEst are available when the estimator is based on estimators of the domains. Component Param shows the values of the parameters involved in the calculation of the estimator (if any). By default, only the Est component (or the ConfInt component, if parameter conf_level is different from NULL) is shown. All components of the object can be accessed by using function summary.
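A minimal sketch of inspecting the returned object, assuming the Frames2 package and its example data sets are installed. The component names are those listed above, the confidence level 0.95 is an arbitrary choice, and the code is skipped when the package is not available.

```r
# Runs only if the Frames2 package is installed
if (requireNamespace("Frames2", quietly = TRUE)) {
  library(Frames2)
  data(DatA)
  data(DatB)

  res <- PML(DatA$Feed, DatB$Feed, DatA$ProbA, DatB$ProbB,
             DatA$Domain, DatB$Domain,
             N_A = 1735, N_B = 1191, conf_level = 0.95)

  res$Est       # total and mean estimates
  res$VarEst    # variance estimates
  res$ConfInt   # present because conf_level is not NULL
  summary(res)  # displays all components
}
```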

References

Skinner, C. J. and Rao, J. N. K. (1996) Estimation in Dual Frame Surveys with Complex Designs. Journal of the American Statistical Association, Vol. 91, No. 433, 349-356.

Examples

data(DatA)
data(DatB)
data(PiklA)
data(PiklB)

#Let us calculate the Pseudo Maximum Likelihood estimator of the population total for variable Clothing
PML(DatA$Clo, DatB$Clo, PiklA, PiklB, DatA$Domain, DatB$Domain, 
N_A = 1735, N_B = 1191)

#Now, let us calculate the Pseudo Maximum Likelihood estimator of the population total for variable
#Feeding, using first order inclusion probabilities
PML(DatA$Feed, DatB$Feed, DatA$ProbA, DatB$ProbB, DatA$Domain, DatB$Domain, 
N_A = 1735, N_B = 1191)

#Finally, let us calculate the Pseudo Maximum Likelihood estimator and a 90% confidence interval for 
#the population total for variable Leisure
PML(DatA$Lei, DatB$Lei, PiklA, PiklB, DatA$Domain, DatB$Domain, 
N_A = 1735, N_B = 1191, conf_level = 0.90)

[Package Frames2 version 0.2.1 Index]