varpoord {vardpoor}    R Documentation
Estimation of the variance and deff for sample surveys for indicators on social exclusion and poverty
Description
Computes variance estimates and design effects for indicators on social exclusion and poverty.
Usage
varpoord(
  Y,
  w_final,
  age = NULL,
  pl085 = NULL,
  month_at_work = NULL,
  Y_den = NULL,
  Y_thres = NULL,
  wght_thres = NULL,
  ID_level1,
  ID_level2 = NULL,
  H,
  PSU,
  N_h,
  PSU_sort = NULL,
  fh_zero = FALSE,
  PSU_level = TRUE,
  sort = NULL,
  Dom = NULL,
  period = NULL,
  gender = NULL,
  dataset = NULL,
  X = NULL,
  periodX = NULL,
  X_ID_level1 = NULL,
  ind_gr = NULL,
  g = NULL,
  q = NULL,
  datasetX = NULL,
  percentage = 60,
  order_quant = 50,
  alpha = 20,
  confidence = 0.95,
  outp_lin = FALSE,
  outp_res = FALSE,
  type = "linrmpg"
)
Arguments
Y
Study variable (for example equalized disposable income or gross pension income). One dimensional object convertible to one-column data.table or variable name as character, column number.
w_final
Weight variable. One dimensional object convertible to one-column data.table or variable name as character, column number.
age
Age variable. One dimensional object convertible to one-column data.table or variable name as character, column number.
pl085
Retirement variable (number of months spent in retirement or early retirement). One dimensional object convertible to one-column data.table or variable name as character, column number.
Y_den
Denominator variable (for example gross individual earnings). One dimensional object convertible to one-column data.table or variable name as character, column number.
Y_thres
Variable (for example equalized disposable income) used for computation and linearization of the poverty threshold. One dimensional object convertible to one-column data.table or variable name as character, column number.
wght_thres
Weight variable used for computation and linearization of the poverty threshold. One dimensional object convertible to one-column data.table or variable name as character, column number.
ID_level1
Variable for level1 ID codes. One dimensional object convertible to one-column data.table or variable name as character, column number.
ID_level2
Optional variable for unit ID codes. One dimensional object convertible to one-column data.table or variable name as character, column number.
H
The unit stratum variable. One dimensional object convertible to one-column data.table or variable name as character, column number.
PSU
Primary sampling unit variable. One dimensional object convertible to one-column data.table or variable name as character, column number.
N_h
Number of primary sampling units in population for each stratum (and period if
PSU_sort
Optional; if PSU_sort is defined, then variance is calculated for a systematic sample.
fh_zero
by default FALSE;
PSU_level
by default TRUE; if PSU_level is TRUE, in each strata
sort
Optional variable to be used as tie-breaker for sorting. One dimensional object convertible to one-column data.table or variable name as character, column number.
Dom
Optional variables used to define population domains. If supplied, estimates are calculated for each domain. An object convertible to data.table.
period
Optional variable for survey period. If supplied, estimates are calculated for each time period. Object convertible to data.table.
gender
Numerical variable for gender, where 1 denotes males and 2 denotes females. One dimensional object convertible to one-column data.table or variable name as character, column number.
dataset
Optional survey data object convertible to data.table.
X
Optional matrix of the auxiliary variables for the calibration estimator. Object convertible to data.table.
periodX
Optional variable of the survey periods. If supplied, residual estimation of calibration is done independently for each time period. Object convertible to data.table.
X_ID_level1
Variable for level1 ID codes. One dimensional object convertible to one-column data.table or variable name as character, column number.
ind_gr
Optional variable by which the X matrix of auxiliary variables is split into groups that are calibrated independently. One dimensional object convertible to one-column data.table or variable name as character, column number.
g
Optional variable of the g weights. One dimensional object convertible to one-column data.table or variable name as character, column number.
q
Variable of the positive values accounting for heteroscedasticity. One dimensional object convertible to one-column data.table or variable name as character, column number.
datasetX
Optional household-level survey data object convertible to data.table.
percentage
A numeric value in the range [0, 100]. For example, to compute a poverty threshold equal to 60% of some income quantile, percentage should be set to 60.
order_quant
A numeric value in the range [0, 100]. For example, to compute a poverty threshold equal to some percentage of median income, order_quant should be set to 50.
alpha
A numeric value in the range [0, 100].
confidence
Optional positive value for the confidence level of the confidence interval. By default 0.95.
outp_lin
Logical value. If TRUE, the linearized values of the ratio estimator are included in the output (lin_out).
outp_res
Logical value. If TRUE, the estimated residuals of calibration are included in the output (res_out).
type
A character vector (of length one unless several.ok is TRUE), for example "linarpr", "linarpt", "lingpg", "linpoormed", "linrmpg", "lingini", "lingini2", "linqsr", "linarr", "linrmir".
month_at_work
Variable for the total number of months at work (sum of the number of months spent at full-time work as employee, number of months spent at part-time work as employee, number of months spent at full-time work as self-employed (including family worker), number of months spent at part-time work as self-employed (including family worker)). One dimensional object convertible to one-column data.table or variable name as character, column number.
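The way the percentage and order_quant arguments combine into a poverty threshold can be illustrated with a minimal base R sketch. It is not the package's internal code: the helper weighted_quantile() and the toy data are hypothetical, and the quantile definition used here (smallest value whose cumulative weight share reaches the requested order) is an assumption that may differ from the one vardpoor applies.

```r
# Hypothetical helper (not part of vardpoor): weighted quantile defined as the
# smallest value whose cumulative weight share reaches order_quant percent.
weighted_quantile <- function(y, w, order_quant = 50) {
  ord <- order(y)
  y <- y[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)
  y[which(cum_share >= order_quant / 100)[1]]
}

# Toy data, made up for illustration only.
income <- c(8000, 12000, 15000, 20000, 40000)
weight <- c(10, 20, 30, 25, 15)

percentage <- 60   # share of the quantile used as threshold
order_quant <- 50  # order of the quantile (50 = median)

quant <- weighted_quantile(income, weight, order_quant)
threshold <- percentage / 100 * quant

quant
threshold
```

With the defaults percentage = 60 and order_quant = 50, the threshold in this sketch is 60% of the weighted median income.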
Value
A list with the following objects is returned by the function:
lin_out - a data.table containing the linearized values of the ratio estimator with ID_level2 and PSU.
res_out - a data.table containing the estimated residuals of calibration with ID_level1 and PSU.
betas - a numeric data.table containing the estimated coefficients of calibration.
all_result - a data.table containing the following variables:
  respondent_count - the count of respondents,
  pop_size - the estimated size of population,
  n_nonzero - the count of respondents whose answers are larger than zero,
  value - the estimated value,
  var - the estimated variance,
  se - the estimated standard error,
  rse - the estimated relative standard error (coefficient of variation),
  cv - the estimated relative standard error (coefficient of variation) in percentage,
  absolute_margin_of_error - the estimated absolute margin of error,
  relative_margin_of_error - the estimated relative margin of error in percentage,
  CI_lower - the estimated confidence interval lower bound,
  CI_upper - the estimated confidence interval upper bound,
  confidence_level - the confidence level used for the confidence interval,
  S2_y_HT - the estimated variance of the y variable in case of total or the estimated variance of the linearised variable in case of the ratio of two totals using non-calibrated weights,
  S2_y_ca - the estimated variance of the y variable in case of total or the estimated variance of the linearised variable in case of the ratio of two totals using calibrated weights,
  S2_res - the estimated variance of the regression residuals,
  var_srs_HT - the estimated variance of the HT estimator under SRS for household,
  var_cur_HT - the estimated variance of the HT estimator under current design for household,
  var_srs_ca - the estimated variance of the calibrated estimator under SRS for household,
  deff_sam - the estimated design effect of sample design for household,
  deff_est - the estimated design effect of estimator for household,
  deff - the overall estimated design effect of sample design and estimator for household.
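The precision measures in all_result are related to value and var in the usual way. The base R sketch below shows those relationships under the assumption of a normal-approximation confidence interval. The input numbers are made up, and the formulas are the standard textbook ones rather than code taken from the package, so the package's own results may differ in detail.

```r
# Made-up inputs for illustration only.
value <- 9000        # estimated value of the indicator
var_est <- 250000    # estimated variance of the estimator
confidence <- 0.95   # confidence level

se <- sqrt(var_est)                  # standard error
rse <- se / value                    # relative standard error
cv <- 100 * rse                      # relative standard error in percent

# Two-sided standard normal quantile for the chosen confidence level
# (assumption: normal approximation).
z <- qnorm(0.5 * (1 + confidence))

absolute_margin_of_error <- z * se
relative_margin_of_error <- z * cv   # in percent
CI_lower <- value - absolute_margin_of_error
CI_upper <- value + absolute_margin_of_error

c(se = se, cv = cv, CI_lower = CI_lower, CI_upper = CI_upper)
```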
Examples
library("data.table")
library("laeken")
library("vardpoor")

data("eusilc")
dataset <- data.table(IDd = paste0("V", 1 : nrow(eusilc)), eusilc)
dataset1 <- dataset[1 : 1000]

# Use dataset1 without setting fh_zero (finite population correction),
# i.e. with the default fh_zero = FALSE
aa <- varpoord(Y = "eqIncome", w_final = "rb050",
               Y_thres = NULL, wght_thres = NULL,
               ID_level1 = "db030", ID_level2 = "IDd",
               H = "db040", PSU = "rb030", N_h = NULL,
               sort = NULL, Dom = NULL,
               gender = NULL, X = NULL,
               X_ID_level1 = NULL, g = NULL,
               q = NULL, datasetX = NULL,
               dataset = dataset1, percentage = 60,
               order_quant = 50L, alpha = 20,
               confidence = .95, outp_lin = FALSE,
               outp_res = FALSE, type = "linarpt")
aa

## Not run: 
# Use dataset1 with fh_zero = TRUE (finite population correction)
# and domains defined by db040
aa2 <- varpoord(Y = "eqIncome", w_final = "rb050",
                Y_thres = NULL, wght_thres = NULL,
                ID_level1 = "db030", ID_level2 = "IDd",
                H = "db040", PSU = "rb030", N_h = NULL,
                fh_zero = TRUE, sort = NULL, Dom = "db040",
                gender = NULL, X = NULL, X_ID_level1 = NULL,
                g = NULL, datasetX = NULL, dataset = dataset1,
                percentage = 60, order_quant = 50L,
                alpha = 20, confidence = .95, outp_lin = FALSE,
                outp_res = FALSE, type = "linarpt")
aa2
aa2$all_result

# Use the full dataset and request linearized values and residuals
# in the output
aa4 <- varpoord(Y = "eqIncome", w_final = "rb050",
                Y_thres = NULL, wght_thres = NULL,
                ID_level1 = "db030", ID_level2 = "IDd",
                H = "db040", PSU = "rb030", N_h = NULL,
                sort = NULL, Dom = "db040",
                gender = NULL, X = NULL,
                X_ID_level1 = NULL, g = NULL,
                datasetX = NULL, dataset = dataset,
                percentage = 60, order_quant = 50L,
                alpha = 20, confidence = .95,
                outp_lin = TRUE, outp_res = TRUE,
                type = "linarpt")
aa4$lin_out[20 : 40]

## End(Not run)