R: Compute the unit (population) variance for a variable

unitVar {PracTools}

R Documentation

Compute the unit (population) variance for a variable

Description

Compute the unit (population) variance for a variable based on either a full population file or a sample from a finite population.

Usage

unitVar(pop.sw = NULL, w = NULL, p = NULL, y = NULL)

Arguments

`pop.sw`	TRUE if the full population is input; FALSE if a sample is input
`w`	vector of sample weights if `y` is a sample; used only if `pop.sw = FALSE`
`p`	vector of 1-draw selection probabilities; optionally provided if `pop.sw = TRUE`
`y`	vector of values of an analysis variable; must be numeric

Details

unitVar computes unit (population) variances of an analysis variable y from either a population or a sample. S2 is the unweighted population variance, S^2 = \sum_{i \in U}(y_i - \bar{y}_U)^2/(N-1) where U is the universe of elements, N is the population size, and \bar{y}_U is the population mean. If the input is a sample, S2 is estimated as \hat{S}^2 = (n/(n-1))\sum_{i \in s} w_i(y_i - \bar{y}_w)^2/(\sum_{i \in s} w_i) where s is the set of sample elements, n is the sample size, and \bar{y}_w is the weighted sample mean.
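
The two S2 formulas above can be reproduced in base R. This is a minimal sketch, not the package's own source: the toy vector `y_pop`, the helper `s2_hat`, and the equal weights are hypothetical and serve only to illustrate the arithmetic.

```r
# Toy population (hypothetical values, not taken from PracTools)
y_pop <- c(2, 4, 4, 4, 5, 5, 7, 9)
N <- length(y_pop)

# Full-population unit variance: squared deviations from the mean over N - 1
S2_pop <- sum((y_pop - mean(y_pop))^2) / (N - 1)

# Sample estimate: weighted squared deviations about the weighted mean,
# divided by the sum of the weights and scaled by n / (n - 1)
s2_hat <- function(y, w) {
  n <- length(y)
  ybar_w <- sum(w * y) / sum(w)
  (n / (n - 1)) * sum(w * (y - ybar_w)^2) / sum(w)
}

# Hypothetical equal-probability sample of 4 units, each with weight N / 4
y_sam <- y_pop[c(1, 3, 6, 8)]
w_sam <- rep(N / 4, 4)
S2_hat <- s2_hat(y_sam, w_sam)
```

With equal weights the estimator reduces to the ordinary sample variance, so `S2_hat` agrees with `var(y_sam)`, and `S2_pop` agrees with `var(y_pop)`.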

V1 is a weighted population variance used in calculations for samples in which elements are selected with varying probabilities. If y is a population vector, V_1 = \sum_{i \in U} p_i(y_i/p_i - t_U)^2 where p_i is the 1-draw probability for element i and t_U is the population total of y. If y is for a sample, \hat{V}_1 = \sum_{i \in s} (y_i/p_i - n^{-1}\sum_{k \in s} y_k/p_k)^2 / (n-1) with p_i computed as 1/(n w_i).
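
The V1 formulas can likewise be written out directly in base R. This is a sketch under stated assumptions: the helper names `v1_pop` and `v1_hat` and the toy data are hypothetical, and the code follows the formulas as written here rather than the package's internal implementation.

```r
# Population V1: sum over U of p_i * (y_i / p_i - t_U)^2
v1_pop <- function(y, p) {
  t_U <- sum(y)                 # population total of y
  sum(p * (y / p - t_U)^2)
}

# Sample estimate of V1, with p_i recovered from the weights as 1 / (n * w_i)
v1_hat <- function(y, w) {
  n <- length(y)
  p <- 1 / (n * w)
  z <- y / p                    # each z_i is a one-draw estimate of the total
  sum((z - mean(z))^2) / (n - 1)
}

# Toy population (hypothetical values)
y <- c(10, 20, 30, 40)
p_equal <- rep(1 / 4, 4)        # equal 1-draw probabilities
p_prop  <- y / sum(y)           # probabilities exactly proportional to y

V1_equal <- v1_pop(y, p_equal)
V1_prop  <- v1_pop(y, p_prop)

# Hypothetical sample of 3 units with unequal weights
V1_est <- v1_hat(c(10, 30, 40), w = c(5, 2, 1))
```

When the probabilities are exactly proportional to y, every ratio y_i/p_i equals the total t_U, so `V1_prop` is zero up to rounding error. This is why probability-proportional-to-size sampling is efficient when the size measure tracks y closely.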

Value

A list with three or four components:

`Note`	Describes whether output was computed from a full population or estimated from a sample.
`Pop size N`	Size of the population; included if `y` is for the full population.
`S2`	Unit variance of `y`; if `pop.sw = TRUE`, `S2` is computed from the full population; if `pop.sw = FALSE`, `S2` is estimated from the sample using the `w` weights.
`V1`	Population variance of `y` appropriate for a sample selected with varying probabilities; see Valliant, Dever, and Kreuter (VDK; 2018, sec. 3.4). If `pop.sw = TRUE` and `p` is provided, `V1` is computed with equation (3.32) in VDK. If `pop.sw = FALSE`, `V1` is estimated with equation (3.41) in VDK.

Author(s)

Richard Valliant

References

Valliant, R., Dever, J., and Kreuter, F. (2018, chap. 3). Practical Tools for Designing and Weighting Survey Samples, 2nd edition. New York: Springer.

Examples

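library(PracTools)
library(sampling)   # provides UPrandomsystematic()

data("smho.N874")
y <- smho.N874[, "EXPTOTAL"]
x <- smho.N874[, "BEDS"]
# Keep only units with a positive measure of size
y <- y[x > 0]
x <- x[x > 0]
# 1-draw selection probabilities proportional to BEDS
pik <- x / sum(x)

# Select a sample of n = 50 with probabilities proportional to BEDS
n <- 50
sam <- UPrandomsystematic(n * pik)
wts <- 1 / (n * pik[sam == 1])

# Full population: S2 and V1 are computed from y and pik
unitVar(pop.sw = TRUE, w = NULL, p = pik, y = y)
# Sample: S2 and V1 are estimated using the weights
unitVar(pop.sw = FALSE, w = wts, p = NULL, y = y[sam == 1])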

[Package PracTools version 1.5 Index]