
Est.EmpDistFunc.NHT {samplingVarEst}

R Documentation

The Narain-Horvitz-Thompson estimator for the empirical cumulative distribution function

Description

Computes the Narain (1951) and Horvitz-Thompson (1952) estimator of the empirical cumulative distribution function (ECDF).

Usage

Est.EmpDistFunc.NHT(VecY.s, VecPk.s, N, t)

Arguments

`VecY.s`	vector of the variable of interest; its length is equal to `n`, the sample size, and must match the length of `VecPk.s`. There must not be missing values.
`VecPk.s`	vector of the first-order inclusion probabilities; its length is equal to `n`, the sample size. Values in `VecPk.s` must be greater than zero and less than or equal to one. There must not be missing values.
`N`	the population size. It must be an integer or a double-precision scalar with zero-valued fractional part.
`t`	value at which the empirical cumulative distribution function is evaluated. It must be an integer or a double-precision scalar.
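
The constraints listed above can be checked before calling the function. The following is a minimal sketch in base R; `check_nht_args` is a hypothetical helper, not part of samplingVarEst, and the package may validate its inputs differently.

```r
# Hypothetical helper (not part of samplingVarEst): checks inputs against
# the constraints documented in the Arguments section.
check_nht_args <- function(VecY.s, VecPk.s, N, t) {
  stopifnot(
    is.numeric(VecY.s), is.numeric(VecPk.s),
    length(VecY.s) == length(VecPk.s),              # both have length n
    !anyNA(VecY.s), !anyNA(VecPk.s),                # no missing values
    all(VecPk.s > 0 & VecPk.s <= 1),                # 0 < pi_k <= 1
    is.numeric(N), length(N) == 1, N == floor(N),   # whole-number scalar
    is.numeric(t), length(t) == 1                   # scalar evaluation point
  )
  invisible(TRUE)
}
```

The helper stops with an error on the first violated constraint and returns `TRUE` invisibly otherwise.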

Details

For the population empirical cumulative distribution function (ECDF) of the variable y at the value t:

Fn(t) = \frac{\#(k\in U:y_k \leq t)}{N} = \frac{1}{N} \sum_{k\in U} I(y_k \leq t)

the unbiased Narain (1951) and Horvitz-Thompson (1952) estimator of Fn(t) (implemented by the current function) is given by:

\hat{F}n_{NHT}(t) = \frac{1}{N} \sum_{k\in s} \frac{I(y_k \leq t)}{\pi_k}

where I(y_k \leq t) denotes the indicator function that takes the value 1 if y_k \leq t and the value 0 otherwise, and where \pi_k denotes the inclusion probability of the k-th element in the sample s.
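
The formula above can be sketched directly in base R. This is an illustration of the arithmetic only, not the package's implementation; `nht_ecdf` and the toy data are hypothetical.

```r
# Hypothetical base-R sketch of the NHT estimator of the ECDF at t
# (an illustration of the formula, not the samplingVarEst source code).
nht_ecdf <- function(y, pik, N, t) {
  # Sum I(y_k <= t) / pi_k over the sample, then divide by N
  sum((y <= t) / pik) / N
}

# Toy data (made up for illustration): a sample of n = 3 from N = 6
y   <- c(10, 20, 30)
pik <- c(0.5, 0.5, 0.5)
nht_ecdf(y, pik, N = 6, t = 20)   # (1/0.5 + 1/0.5 + 0) / 6 = 2/3
```

Because the weighted sum is divided by the known N rather than by the sum of the weights 1/\pi_k, the estimate is not constrained to lie in [0, 1] for a given sample.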

Value

The function returns the estimate of the empirical cumulative distribution function evaluated at t.

Author(s)

Emilio Lopez Escobar [aut, cre], Juan Francisco Munoz Rosas [ctb].

References

Horvitz, D. G. and Thompson, D. J. (1952) A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663–685.

Narain, R. D. (1951) On sampling without replacement with varying probabilities. Journal of the Indian Society of Agricultural Statistics, 3, 169–175.

Examples

data(oaxaca)                                       #Loads Oaxaca municipalities dataset
pik.U <- Pk.PropNorm.U(373, oaxaca$HOMES00)        #Reconstructs the inclusion probs.
s     <- oaxaca$sHOMES00                           #Defines the sample to be used
N     <- dim(oaxaca)[1]                            #Defines the population size
y1    <- oaxaca$POP10                              #Defines the variable of interest y1
Est.EmpDistFunc.NHT(y1[s==1], pik.U[s==1], N, 950) #NHT est. of ECDF for y1 at t=950

[Package samplingVarEst version 1.5 Index]