vardom_othstr {vardpoor}

R Documentation

Variance estimation for sample surveys in domains under two stratifications

Description

Computes variance estimates for domain estimators in sample surveys when two stratifications are used: the unit stratum (`H`) and the new unit stratum (`H2`).

Usage

vardom_othstr(
  Y,
  H,
  H2,
  PSU,
  w_final,
  id = NULL,
  Dom = NULL,
  period = NULL,
  N_h = NULL,
  N_h2 = NULL,
  Z = NULL,
  X = NULL,
  ind_gr = NULL,
  g = NULL,
  q = NULL,
  dataset = NULL,
  confidence = 0.95,
  percentratio = 1,
  outp_lin = FALSE,
  outp_res = FALSE
)

Arguments

`Y`	Variables of interest. Object convertible to `data.table` or variable names as character, column numbers.
`H`	The unit stratum variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`H2`	The unit new stratum variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`PSU`	Primary sampling unit variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`w_final`	Weight variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`id`	Optional variable for unit ID codes. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`Dom`	Optional variables used to define population domains. If supplied, estimation is done for each domain. An object convertible to `data.table` or variable names as character vector, column numbers.
`period`	Optional variable for survey period. If supplied, residual estimation of calibration is done independently for each time period. One dimensional object convertible to one-column `data.table`.
`N_h`	Optional data object convertible to `data.table` that gives the population size of each stratum. If `period` is supplied, the time period comes first in the object, followed by the stratum. If `period` is not supplied, the first column is the stratum. The last column holds the population total of each stratum.
`N_h2`	Optional data object convertible to `data.table` that gives the population size of each new stratum. If `period` is supplied, the time period comes first in the object, followed by the new stratum. If `period` is not supplied, the first column is the new stratum. The last column holds the population total of each new stratum.
`Z`	Optional variables for the denominator in ratio estimation. Object convertible to `data.table` or variable names as character, column numbers.
`X`	Optional matrix of the auxiliary variables for the calibration estimator. Object convertible to `data.table` or variable names as character, column numbers.
`ind_gr`	Optional variable that splits the `X` matrix of auxiliary variables into groups that are calibrated independently. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`g`	Optional variable of the g weights. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`q`	Variable of the positive values accounting for heteroscedasticity. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`dataset`	Optional survey data object convertible to `data.table`.
`confidence`	Optional positive value giving the confidence level of the confidence interval. The default is 0.95.
`outp_lin`	Logical value. If `TRUE`, the linearized values of the ratio estimator are returned in `lin_out`.
`outp_res`	Logical value. If `TRUE`, the estimated residuals of calibration are returned in `res_out`.
`percentratio`	Positive numeric value. All linearized variables are multiplied by the `percentratio` value. The default is 1.
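
The layout described for `N_h` and `N_h2` can be sketched as follows. This is a minimal illustration in base R, assuming a hypothetical survey with strata `A` and `B` and made-up weights; the names `survey`, `stratum`, `weight`, `year`, and `pop_total` are assumptions, and a plain `data.frame` is used because the argument only has to be convertible to `data.table`. Summing the weights per stratum mirrors what the Examples section does; known population counts could be used instead.

```r
# Hypothetical unit-level data: stratum code and final weight per unit
survey <- data.frame(
  stratum = c("A", "A", "B", "B", "B"),
  weight  = c(10, 12, 20, 18, 22)
)

# Without `period`: stratum in the first column, population total in the last
N_h <- aggregate(weight ~ stratum, data = survey, FUN = sum)
names(N_h)[2] <- "pop_total"

# With `period`: the period column comes first, then the stratum, then the total
survey$year <- c(2020, 2021, 2020, 2021, 2021)
N_h_period <- aggregate(weight ~ year + stratum, data = survey, FUN = sum)
names(N_h_period)[3] <- "pop_total"
```

Giving the stratum column the same name as the variable passed to `H` (or `H2`), as the Examples section does with `db040_2`, is probably the safest choice.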

Value

The function returns a list with the following objects:

lin_out - a data.table containing the linearized values of the ratio estimator with id and PSU.
res_out - a data.table containing the estimated residuals of calibration with id and PSU.
betas - a numeric data.table containing the estimated coefficients of calibration.
s2g - a data.table containing the s^2g value.
all_result - a data.table containing the variables:
respondent_count - the count of respondents,
pop_size - the estimated size of population,
n_nonzero - the count of respondents whose answers are greater than zero,
estim - the estimated value,
var - the estimated variance,
se - the estimated standard error,
rse - the estimated relative standard error (coefficient of variation),
cv - the estimated relative standard error (coefficient of variation) in percentage,
absolute_margin_of_error - the estimated absolute margin of error,
relative_margin_of_error - the estimated relative margin of error in percentage,
CI_lower - the estimated confidence interval lower bound,
CI_upper - the estimated confidence interval upper bound,
confidence_level - the confidence level used for the confidence interval,
var_srs_HT - the estimated variance of the HT estimator under SRS,
var_cur_HT - the estimated variance of the HT estimator under current design,
var_srs_ca - the estimated variance of the calibrated estimator under SRS,
deff_sam - the estimated design effect of sample design,
deff_est - the estimated design effect of estimator,
deff - the overall estimated design effect of sample design and estimator.
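
How the precision columns of `all_result` relate to one another can be sketched from an estimate and its variance. The snippet below is an illustration under the usual normal-approximation definitions, not the package's source code: the input numbers are made up, and the use of the quantile `qnorm((1 + confidence) / 2)` for the margins and the interval is an assumption that should be checked against the package.

```r
# Made-up inputs standing in for one row of all_result
estim      <- 2000     # estimated value
var_est    <- 10000    # estimated variance
confidence <- 0.95

se  <- sqrt(var_est)   # standard error
rse <- se / estim      # relative standard error
cv  <- 100 * rse       # relative standard error in percent

# Two-sided normal quantile for the chosen confidence level (assumed)
z <- qnorm((1 + confidence) / 2)

absolute_margin_of_error <- z * se
relative_margin_of_error <- z * cv   # in percent
CI_lower <- estim - absolute_margin_of_error
CI_upper <- estim + absolute_margin_of_error
```

With these inputs the standard error is 100 and the coefficient of variation is 5 percent, so the interval is symmetric around 2000.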

References

Jean-Claude Deville (1999). Variance estimation for complex statistics and estimators: linearization and residual techniques. Survey Methodology, 25, 193-203, URL https://www150.statcan.gc.ca/n1/pub/12-001-x/1999002/article/4882-eng.pdf.
M. Liberts (2004). Non-response Analysis and Bias Estimation in a Survey on Transportation of Goods by Road.

Examples

library("vardpoor")
library("laeken")
library("data.table")
data("eusilc")
  
# Example 1: first 1000 records of eusilc
eusilc1 <- eusilc[1:1000, ]
dataset1 <- data.table(IDd = paste0("V", 1:nrow(eusilc1)), eusilc1)
# New stratum variable for H2 (here a copy of db040)
dataset1[, db040_2 := get("db040")]
# Totals per new stratum: stratum in the first column, total in the last
N_h2 <- dataset1[, sum(rb050, na.rm = FALSE), keyby = "db040_2"]

aa <- vardom_othstr(Y = "eqIncome", H = "db040", H2 = "db040_2",
                    PSU = "db030", w_final = "rb050", id = "rb030",
                    Dom = "db040", period = NULL, N_h = NULL,
                    N_h2 = N_h2, Z = NULL, X = NULL, g = NULL,
                    q = NULL, dataset = dataset1, confidence = .95,
                    outp_lin = TRUE, outp_res = TRUE)
  
## Not run: 
# Example 2: the full eusilc data set
dataset1 <- data.table(IDd = 1:nrow(eusilc), eusilc)
# New stratum variable for H2 (here a copy of db040)
dataset1[, db040_2 := get("db040")]
# Totals per new stratum: stratum in the first column, total in the last
N_h2 <- dataset1[, sum(rb050, na.rm = FALSE), keyby = "db040_2"]

aa <- vardom_othstr(Y = "eqIncome", H = "db040", H2 = "db040_2",
                    PSU = "db030", w_final = "rb050", id = "rb030",
                    Dom = "db040", period = NULL, N_h2 = N_h2,
                    Z = NULL, X = NULL, g = NULL, dataset = dataset1,
                    q = NULL, confidence = .95, outp_lin = TRUE,
                    outp_res = TRUE)
aa
## End(Not run)

[Package vardpoor version 0.20.1 Index]