qvar {gustave} | R Documentation |
Quickly perform a variance estimation in common cases
Description
qvar (for "quick variance estimation") is a function
performing analytical variance estimation in the most common cases, that is:
stratified simple random sampling
non-response correction (if any) through reweighting
calibration (if any)
Used with define = TRUE, it defines a so-called variance wrapper, that is,
a standalone ready-to-use function that can be applied to the survey dataset
without having to specify the methodological characteristics of the survey
(see define_variance_wrapper).
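For the first case listed above, the textbook variance estimator of a total under stratified simple random sampling can be written out in a few lines of base R. The sketch below is an illustration only: the toy vectors `N_h`, `y` and `stratum` are hypothetical, and nothing here claims to reproduce the internal computations of qvar.

```r
# Hypothetical toy data (not from the gustave package)
N_h <- c(A = 100, B = 50)                  # population size per stratum
y <- c(12, 15, 11, 14, 30, 34, 32)         # variable of interest in the sample
stratum <- c("A", "A", "A", "A", "B", "B", "B")

n_h <- tapply(y, stratum, length)          # sample size per stratum
ybar_h <- tapply(y, stratum, mean)         # sample mean per stratum
s2_h <- tapply(y, stratum, var)            # sample variance per stratum
N_h <- N_h[names(n_h)]                     # align strata order with tapply output

# Estimated total: sum over strata of N_h * mean_h
total_hat <- sum(N_h * ybar_h)

# Textbook variance estimator under stratified simple random sampling:
# sum over strata of N_h^2 * (1 - n_h / N_h) * s2_h / n_h
var_hat <- sum(N_h^2 * (1 - n_h / N_h) * s2_h / n_h)

c(total = total_hat, variance = var_hat, std_error = sqrt(var_hat))
```

Non-response correction and calibration, the two other cases listed, change the quantity whose dispersion is measured; they are deliberately left out of this sketch.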
Usage
qvar(
data,
...,
by = NULL,
where = NULL,
alpha = 0.05,
display = TRUE,
id,
dissemination_dummy,
dissemination_weight,
sampling_weight,
strata = NULL,
scope_dummy = NULL,
nrc_weight = NULL,
response_dummy = NULL,
nrc_dummy = NULL,
calibration_weight = NULL,
calibration_dummy = NULL,
calibration_var = NULL,
define = FALSE,
envir = parent.frame()
)
Arguments
data |
The |
... |
One or more calls to a statistic wrapper (e.g. |
by |
A qualitative variable whose levels are used to define domains on which the variance estimation is performed. |
where |
A logical vector indicating a domain on which the variance estimation is to be performed. |
alpha |
A numeric vector of length 1 indicating the threshold
for confidence interval derivation ( |
display |
A logical vector of length 1 indicating whether the result of the estimation should be displayed or not. |
id |
The identification variable of the units in |
dissemination_dummy |
A character vector of length 1, the name
of the logical variable in |
dissemination_weight |
A character vector of length 1, the name
of the numerical variable in |
sampling_weight |
A character vector of length 1, the name of the
numeric variable in |
strata |
A character vector of length 1, the name of the factor
variable in |
scope_dummy |
A character vector of length 1, the name of the logical
variable in |
nrc_weight |
A character vector of length 1, the name of the
numerical variable in |
response_dummy |
A character vector of length 1, the name of the logical
variable in |
nrc_dummy |
A character vector of length 1, the name of
the logical variable in |
calibration_weight |
A character vector of length 1, the name of the
numerical variable in |
calibration_dummy |
A character vector of length 1, the name of the logical
variable in |
calibration_var |
A character vector, the name of the variable(s) used in
the calibration process. Logical variables are coerced to numeric.
Character and factor variables are automatically discretized.
|
define |
Logical vector of length 1. Should a variance wrapper be defined instead of performing a variance estimation (see details and examples)? |
envir |
An environment containing a binding to |
Details
qvar performs not only technical but also
methodological checks in order to ensure that the standard variance
estimation methodology does apply (e.g. equal probability of
inclusion within strata, number of units per stratum).
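The kind of methodological check mentioned above can be sketched in base R. This is only an illustration under assumptions: the data frame `toy_sample`, its columns and the helper `check_design()` are hypothetical and are not part of gustave, and the checks qvar actually runs may differ.

```r
# Hypothetical toy sample: two strata, one sampling weight per unit
toy_sample <- data.frame(
  id = 1:6,
  strata = c("A", "A", "A", "B", "B", "B"),
  w_sample = c(10, 10, 10, 4, 4, 5)
)

# Hypothetical helper (not a gustave function): for each stratum, report
# the number of sampled units and whether the sampling weight is constant,
# which is what equal inclusion probabilities within a stratum imply
check_design <- function(data, strata, sampling_weight) {
  w_by_stratum <- split(data[[sampling_weight]], data[[strata]])
  data.frame(
    stratum = names(w_by_stratum),
    n_units = unname(vapply(w_by_stratum, length, integer(1))),
    equal_weights = unname(vapply(
      w_by_stratum,
      function(w) length(unique(w)) == 1L,
      logical(1)
    ))
  )
}

design_check <- check_design(toy_sample, "strata", "w_sample")
design_check
```

In this toy sample, stratum B mixes two different weights, so it is the one such a check would flag.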
Used with define = TRUE, the function returns a variance
estimation wrapper, that is, a ready-to-use function that
implements the described variance estimation methodology and
contains all necessary data to do so (see examples).
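How a function can "contain" the data it needs may be easier to picture with a plain R closure. The sketch below is an assumption-laden illustration, not the gustave implementation: `make_wrapper()`, `toy_reference` and `toy_survey` are hypothetical names, and the returned function only computes a weighted mean rather than a variance.

```r
# Hypothetical factory (not a gustave function): the returned function keeps
# `reference` in its enclosing environment, so callers no longer pass it
make_wrapper <- function(reference) {
  force(reference)  # evaluate now so the data is captured at definition time
  function(data, variable) {
    # Look up the stored weight of each unit through its identifier
    w <- reference$w[match(data$id, reference$id)]
    sum(w * data[[variable]]) / sum(w)
  }
}

# Technical data known at definition time (hypothetical)
toy_reference <- data.frame(id = 1:4, w = c(2, 2, 1, 1))
toy_wrapper <- make_wrapper(toy_reference)

# Later use: only the survey data and the variable of interest are supplied
toy_survey <- data.frame(id = c(1, 3, 4), y = c(10, 20, 40))
weighted_mean <- toy_wrapper(toy_survey, "y")
weighted_mean
```

The same separation shows up in the examples of this page: the methodological description is given once, and later calls only name the dataset and the statistic.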
Note: To some extent, qvar is analogous to the qplot function
in the ggplot2 package, as it is an easier-to-use function for common
cases. More complex cases are to be handled by using the core functions of
the gustave package, e.g. define_variance_wrapper.
See Also
define_variance_wrapper, standard_statistic_wrapper
Examples
### Example from the Information and communication technologies (ICT) survey
# The (simulated) Information and communication technologies (ICT) survey
# has the following characteristics:
# - stratified one-stage sampling design
# - non-response correction through reweighting in homogeneous response groups
# - calibration on margins.
# The ict_survey data.frame is a (simulated) subset of the ICT
# survey file containing the variables of interest for the 612
# responding firms.
# The ict_sample data.frame is the (simulated) sample of 650
# firms corresponding to the ict_survey file. It contains all
# technical information necessary to estimate a variance with
# the qvar() function.
## Methodological description of the survey
# Direct call of qvar()
qvar(
  # Sample file
  data = ict_sample,
  # Dissemination and identification information
  dissemination_dummy = "dissemination",
  dissemination_weight = "w_calib",
  id = "firm_id",
  # Scope
  scope_dummy = "scope",
  # Sampling design
  sampling_weight = "w_sample",
  strata = "strata",
  # Non-response correction
  nrc_weight = "w_nrc",
  response_dummy = "resp",
  hrg = "hrg",
  # Calibration
  calibration_weight = "w_calib",
  calibration_var = c(paste0("N_", 58:63), paste0("turnover_", 58:63)),
  # Statistic(s) and variable(s) of interest
  mean(employees)
)
# Definition of a variance estimation wrapper
precision_ict <- qvar(
  # As before
  data = ict_sample,
  dissemination_dummy = "dissemination",
  dissemination_weight = "w_calib",
  id = "firm_id",
  scope_dummy = "scope",
  sampling_weight = "w_sample",
  strata = "strata",
  nrc_weight = "w_nrc",
  response_dummy = "resp",
  hrg = "hrg",
  calibration_weight = "w_calib",
  calibration_var = c(paste0("N_", 58:63), paste0("turnover_", 58:63)),
  # Replacing the variables of interest by define = TRUE
  define = TRUE
)
# Use of the variance estimation wrapper
precision_ict(ict_sample, mean(employees))
# The variance estimation wrapper can also be used on the survey file
precision_ict(ict_survey, mean(speed_quanti))
## Features of the variance estimation wrapper
# Several statistics in one call (with optional labels)
precision_ict(ict_survey,
  "Mean internet speed in Mbps" = mean(speed_quanti),
  "Turnover per employee" = ratio(turnover, employees)
)
# Domain estimation with where and by arguments
precision_ict(ict_survey,
  mean(speed_quanti),
  where = employees >= 50
)
precision_ict(ict_survey,
  mean(speed_quanti),
  by = division
)
# Domain may differ from one estimator to another
precision_ict(ict_survey,
  "Mean turnover, firms with 50 employees or more" = mean(turnover, where = employees >= 50),
  "Mean turnover, firms with 100 employees or more" = mean(turnover, where = employees >= 100)
)
# On-the-fly evaluation (e.g. discretization)
precision_ict(ict_survey, mean(speed_quanti > 100))
# Automatic discretization for qualitative (character or factor) variables
precision_ict(ict_survey, mean(speed_quali))
# Standard evaluation capabilities
variables_of_interest <- c("speed_quanti", "speed_quali")
precision_ict(ict_survey, mean(variables_of_interest))
# Integration with %>% and dplyr
library(magrittr)
library(dplyr)
ict_survey %>%
  precision_ict("Internet speed above 100 Mbps" = mean(speed_quanti > 100)) %>%
  select(label, est, lower, upper)