vardchanges {vardpoor}

R Documentation

Variance estimation for measures of change for single-stage and multistage cluster sampling designs

Description

Computes variance estimates for measures of change for single-stage and multistage cluster sampling designs.

Usage

vardchanges(
  Y,
  H,
  PSU,
  w_final,
  ID_level1,
  ID_level2,
  Dom = NULL,
  Z = NULL,
  gender = NULL,
  country = NULL,
  period,
  dataset = NULL,
  period1,
  period2,
  X = NULL,
  countryX = NULL,
  periodX = NULL,
  X_ID_level1 = NULL,
  ind_gr = NULL,
  g = NULL,
  q = NULL,
  datasetX = NULL,
  linratio = FALSE,
  percentratio = 1,
  use.estVar = FALSE,
  outp_res = FALSE,
  confidence = 0.95,
  change_type = "absolute",
  checking = TRUE
)

Arguments

`Y`	Variables of interest. Object convertible to `data.table` or variable names as character, column numbers.
`H`	The unit stratum variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`PSU`	Primary sampling unit variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`w_final`	Weight variable. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`ID_level1`	Variable for level1 ID codes. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`ID_level2`	Optional variable for unit ID codes. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`Dom`	Optional variables used to define population domains. If supplied, estimates are computed for each domain. An object convertible to `data.table` or variable names as character vector, column numbers.
`Z`	Optional variables of denominator for ratio estimation. If supplied, the ratio estimation is computed. Object convertible to `data.table` or variable names as character, column numbers. This variable is `NULL` by default.
`gender`	Numerical variable for gender, where 1 denotes males and 2 denotes females. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`country`	Variable for the survey countries. The values for each country are computed independently. Object convertible to `data.table` or variable names as character, column numbers.
`period`	Variable for all survey periods. The values for each period are computed independently. Object convertible to `data.table` or variable names as character, column numbers.
`dataset`	Optional survey data object convertible to `data.table`.
`period1`	Vector of values of the variable `period` that define the first period.
`period2`	Vector of values of the variable `period` that define the second period.
`X`	Optional matrix of the auxiliary variables for the calibration estimator. Object convertible to `data.table` or variable names as character, column numbers.
`countryX`	Optional variable for the survey countries. The values for each country are computed independently. Object convertible to `data.table` or variable names as character, column numbers.
`periodX`	Optional variable for all survey periods. If supplied, residual estimation of calibration is done independently for each time period. Object convertible to `data.table` or variable names as character, column numbers.
`X_ID_level1`	Variable for level1 ID codes. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`ind_gr`	Optional variable by which the `X` matrix of auxiliary variables is divided into independent groups for the calibration. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`g`	Optional variable of the g weights. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`q`	Variable of the positive values accounting for heteroscedasticity. One dimensional object convertible to one-column `data.table` or variable name as character, column number.
`datasetX`	Optional survey data object in household level convertible to `data.table`.
`linratio`	Logical value. If `TRUE`, the linearized variables for the ratio estimator are used for variance estimation. If `FALSE`, the gradients are used for variance estimation.
`percentratio`	Positive numeric value. All linearized variables are multiplied by the `percentratio` value. The default is 1.
`use.estVar`	Logical value. If `TRUE`, the `R` function `estVar` is used to estimate the covariance matrix of the residuals. If `FALSE`, `estVar` is not used.
`outp_res`	Logical value. If `TRUE`, the estimated residuals of calibration are output.
`confidence`	Optional positive value for the confidence level of the confidence interval. The default is 0.95.
`change_type`	Character value giving the type of net change: `"absolute"` or `"relative"`.
`checking`	Optional logical value. If `TRUE`, the function checks for data preparation errors; otherwise no checks are performed. The default is `TRUE`.
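
The relation between `confidence` and a reported interval can be illustrated with a minimal sketch. It assumes a normal-approximation interval (the estimate plus or minus a standard normal quantile times the standard error). This is an assumption made for illustration, not a statement about the package's internal code, and the values of `estim` and `se` below are hypothetical.

```r
# Hypothetical estimate and standard error (not output of vardchanges()).
estim <- 0.05
se <- 0.02
confidence <- 0.95

# Two-sided standard normal quantile for the chosen confidence level
# (assumed normal approximation).
z <- qnorm(0.5 + confidence / 2)

# Half-width of the interval and its bounds.
absolute_margin_of_error <- z * se
CI_lower <- estim - absolute_margin_of_error
CI_upper <- estim + absolute_margin_of_error

c(CI_lower = CI_lower, CI_upper = CI_upper)
```

Under this assumption, raising `confidence` widens the interval, because the quantile `z` grows while `se` stays fixed.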

Value

The function returns a list with the following objects:

res_out - a data.table containing the estimated residuals of calibration with ID_level1 and PSU by periods and countries (if available).
crossectional_results - a data.table containing:
period - survey periods,
country - survey countries,
Dom - optional variable of the population domains,
namesY - variable with names of variables of interest,
namesZ - optional variable with names of denominator for ratio estimation,
sample_size - the sample size (in numbers of individuals),
pop_size - the population size (in numbers of individuals),
total - the estimated totals,
variance - the estimated variance of cross-sectional or longitudinal measures,
sd_w - the estimated weighted variance of simple random sample,
sd_nw - the estimated variance of simple random sample,
pop - the population size (in numbers of households),
sampl_siz - the sample size (in numbers of households),
stderr_w - the estimated weighted standard error of simple random sample,
stderr_nw - the estimated standard error of simple random sample,
se - the estimated standard error of cross-sectional or longitudinal measures,
rse - the estimated relative standard error (coefficient of variation),
cv - the estimated relative standard error (coefficient of variation) in percentage,
absolute_margin_of_error - the estimated absolute margin of error,
relative_margin_of_error - the estimated relative margin of error,
CI_lower - the estimated confidence interval lower bound,
CI_upper - the estimated confidence interval upper bound.
crossectional_var_grad - a data.table containing:
periods - survey periods,
country - survey countries,
Dom - optional variable of the population domains,
namesY - variable with names of variables of interest,
namesZ - optional variable with names of denominator for ratio estimation,
grad - the estimated gradient,
var - the estimated design-based variance.
rho - a data.table containing:
periods_1 - survey periods of `period1`,
periods_2 - survey periods of `period2`,
country - survey countries,
Dom - optional variable of the population domains,
namesY - variable with names of variables of interest,
namesZ - optional variable with names of denominator for ratio estimation,
nams - the variable names in correlation matrix,
rho - the estimated correlation matrix.
var_tau - a data.table containing:
periods_1 - survey periods of `period1`,
periods_2 - survey periods of `period2`,
country - survey countries,
Dom - optional variable of the population domains,
namesY - variable with names of variables of interest,
namesZ - optional variable with names of denominator for ratio estimation,
nams - the variable names in correlation matrix,
var_tau - the estimated covariance matrix.
changes_results - a data.table containing:
periods_1 - survey periods of `period1`,
periods_2 - survey periods of `period2`,
country - survey countries,
Dom - optional variable of the population domains,
namesY - variable with names of variables of interest,
namesZ - optional variable with names of denominator for ratio estimation,
estim_1 - the estimated value for period1,
estim_2 - the estimated value for period2,
estim - the estimated value,
var - the estimated variance,
se - the estimated standard error,
CI_lower - the estimated confidence interval lower bound,
CI_upper - the estimated confidence interval upper bound,
significant - whether the difference is significant.
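
How a correlation such as the one reported in `rho` enters the variance of a change can be sketched with the textbook identity for the variance of a difference of two correlated estimators. The single-variable case below uses hypothetical inputs and is a simplification: the function reports a full covariance matrix (`var_tau`) and gradients, so this sketch should not be read as the package's implementation.

```r
# Hypothetical cross-sectional estimates, variances and correlation
# (not output of vardchanges()).
estim_1 <- 0.20
estim_2 <- 0.23
var_1 <- 0.0004
var_2 <- 0.0005
rho <- 0.6

# Absolute change between the two periods.
estim <- estim_2 - estim_1

# Var(theta_2 - theta_1) = Var_1 + Var_2 - 2 * rho * sqrt(Var_1 * Var_2).
var_change <- var_1 + var_2 - 2 * rho * sqrt(var_1 * var_2)
se_change <- sqrt(var_change)

# For comparison: the variance obtained if the two periods were
# (wrongly) treated as independent samples.
var_indep <- var_1 + var_2

c(estim = estim, se_change = se_change, se_indep = sqrt(var_indep))
```

With a positive correlation, as is typical when the same units are observed in both periods, the variance of the change is smaller than the sum of the two cross-sectional variances.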

References

Guillaume Osier, Yves Berger, Tim Goedeme, (2013), Standard error estimation for the EU-SILC indicators of poverty and social exclusion, Eurostat Methodologies and Working papers, URL http://ec.europa.eu/eurostat/documents/3888793/5855973/KS-RA-13-024-EN.PDF.
Eurostat Methodologies and Working papers, Handbook on precision requirements and variance estimation for ESS household surveys, 2013, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.
Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL https://ec.europa.eu/eurostat/cros/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013_en

Examples


### Example 
library("data.table")
library("laeken")
data("eusilc")
eusilc1 <- eusilc[1:40, ]
set.seed(1)
dataset1 <- data.table(rbind(eusilc1, eusilc1),
                       year = c(rep(2010, nrow(eusilc1)),
                                rep(2011, nrow(eusilc1))))
dataset1[age < 0, age := 0]
PSU <- dataset1[, .N, keyby = "db030"][, N := NULL]
PSU[, PSU := trunc(runif(nrow(PSU), 0, 5))]
dataset1 <- merge(dataset1, PSU, all = TRUE, by = "db030")
PSU <- eusilc <- NULL
dataset1[, strata := "XXXX"]

dataset1[, t_pov := trunc(runif(nrow(dataset1), 0, 2))]
dataset1[, exp := 1]

# At-risk-of-poverty (AROP)
dataset1[, pov := ifelse (t_pov == 1, 1, 0)]
dataset1[, id_lev2 := paste0("V", .I)]


result <- vardchanges(Y = "pov", H = "strata", 
                      PSU = "PSU", w_final = "rb050",
                      ID_level1 = "db030", ID_level2 = "id_lev2",
                      Dom = NULL, Z = NULL, period = "year",
                      dataset = dataset1, period1 = 2010,
                      period2 = 2011, change_type = "absolute")
result

## Not run: 
data("eusilc")
dataset1 <- data.table(rbind(eusilc, eusilc),
                       year = c(rep(2010, nrow(eusilc)),
                                rep(2011, nrow(eusilc))))
dataset1[age < 0, age := 0]
PSU <- dataset1[,.N, keyby = "db030"][, N := NULL]
PSU[, PSU := trunc(runif(nrow(PSU), 0, 100))]
dataset1 <- merge(dataset1, PSU, all = TRUE, by = "db030")
PSU <- eusilc <- NULL
dataset1[, strata := "XXXX"]
  
dataset1[, t_pov := trunc(runif(nrow(dataset1), 0, 2))]
dataset1[, t_dep := trunc(runif(nrow(dataset1), 0, 2))]
dataset1[, t_lwi := trunc(runif(nrow(dataset1), 0, 2))]
dataset1[, exp := 1]
dataset1[, exp2 := 1 * (age < 60)]
  
# At-risk-of-poverty (AROP)
dataset1[, pov := ifelse (t_pov == 1, 1, 0)]
  
# Severe material deprivation (DEP)
dataset1[, dep := ifelse (t_dep == 1, 1, 0)]
  
# Low work intensity (LWI)
dataset1[, lwi := ifelse (t_lwi == 1 & exp2 == 1, 1, 0)]
  
# At-risk-of-poverty or social exclusion (AROPE)
dataset1[, arope := ifelse (pov == 1 | dep == 1 | lwi == 1, 1, 0)]
dataset1[, dom := 1]
dataset1[, id_lev2 := .I]
  
result <- vardchanges(Y = c("pov", "dep", "lwi", "arope"),
                      H = "strata", PSU = "PSU", w_final = "rb050",
                      ID_level1 = "db030", ID_level2 = "id_lev2",
                      Dom = "rb090", Z = NULL, period = "year",
                      dataset = dataset1, period1 = 2010, 
                      period2 = 2011, change_type = "absolute")
result
## End(Not run)

[Package vardpoor version 0.20.1 Index]