assess_range_of_bias {nrba} | R Documentation |
Assess the range of possible bias based on specified assumptions about how nonrespondents differ from respondents
Description
This range-of-bias analysis assesses the range of possible nonresponse bias under varying assumptions about how nonrespondents differ from respondents. The range of potential bias is calculated for both unadjusted estimates (i.e., from using base weights) and nonresponse-adjusted estimates (i.e., based on nonresponse-adjusted weights).
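As a rough sketch of the arithmetic behind this kind of analysis, consider a single cell under the assumed-multiple approach. The notation below is introduced here for illustration only and is not taken from the package: W_R and W_N denote the sums of base weights for respondents and nonrespondents, \bar{y}_R is the weighted respondent mean, and m is the assumed multiple.

```latex
% Notation assumed for illustration (not from the nrba documentation):
%   W_R, W_N  : sums of base weights for respondents and nonrespondents
%   \bar{y}_R : weighted mean among respondents
%   m         : assumed multiple, so the nonrespondent mean is m \bar{y}_R
\begin{align*}
\bar{y}_{\text{full}}
  &= \frac{W_R\,\bar{y}_R + W_N\,m\,\bar{y}_R}{W_R + W_N}, \\
\operatorname{bias}(\bar{y}_R)
  &= \bar{y}_R - \bar{y}_{\text{full}}
   = (1 - m)\,\frac{W_N}{W_R + W_N}\,\bar{y}_R .
\end{align*}
```

For instance, if nonrespondents account for 25% of the weighted sample and m = 0.9, the implied bias is (1 - 0.9) × 0.25 = 2.5% of the respondent mean.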
Usage
assess_range_of_bias(
survey_design,
y_var,
comparison_cell,
status,
status_codes,
assumed_multiple = c(0.5, 0.75, 0.9, 1.1, 1.25, 1.5),
assumed_percentile = NULL
)
Arguments
survey_design |
A survey design object created with the 'survey' package |
y_var |
Name of a variable whose mean or proportion is to be estimated |
comparison_cell |
(Optional) The name of a variable in the data dividing the sample into cells. If supplied, then the analysis is based on assumptions about differences between respondents and nonrespondents within the same cell. Typically, the variable used is a nonresponse adjustment cell or post-stratification variable. |
status |
A character string giving the name of the variable representing response/eligibility status. The status variable should have at most four categories, representing eligible respondents (ER), eligible nonrespondents (EN), known ineligible cases (IE), and cases whose eligibility is unknown (UE). |
status_codes |
A named vector with four entries named 'ER', 'EN', 'IE', and 'UE'. Each entry gives the value of the status variable that identifies the corresponding category. |
assumed_multiple |
One or more numeric values. Within each nonresponse adjustment cell, the mean for nonrespondents is assumed to be a specified multiple of the mean for respondents. If |
assumed_percentile |
One or more numeric values, ranging from 0 to 1. Within each nonresponse adjustment cell, the mean of a continuous variable among nonrespondents is assumed to equal a specified percentile of the variable among respondents. The |
Value
A data frame summarizing the range of bias under each assumption. For a numeric outcome variable, there is one row per value of assumed_multiple or assumed_percentile. For a categorical outcome variable, there is one row per combination of category and assumed_multiple or assumed_percentile.
The column bias_of_unadj_estimate is the nonresponse bias of the estimate from respondents produced using the unadjusted weights. The column bias_of_adj_estimate is the nonresponse bias of the estimate from respondents produced using nonresponse-adjusted weights, based on a weighting-class adjustment with comparison_cell as the weighting class variable. If no comparison_cell is specified, the two bias estimates will be the same.
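The two bias columns can be illustrated by hand on a small data set. The following is a minimal sketch under stated assumptions, not the package's implementation: the toy data frame and the helper bias_sketch() are hypothetical, the output column names simply mirror those described above, and the "true" mean is taken to be the full-sample mean implied by the assumed multiple.

```r
# Hypothetical toy data (not from the nrba package): two cells, 'resp' flags respondents
toy <- data.frame(
  cell   = c("A", "A", "A", "A", "B", "B", "B", "B"),
  resp   = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  weight = c(10, 10, 10, 10, 20, 20, 20, 20),
  y      = c(4, 6, 8, NA, 10, NA, NA, NA)
)

# Hypothetical helper, written for illustration only
bias_sketch <- function(data, multiples) {
  cells <- sort(unique(data$cell))
  # Weight totals for respondents and nonrespondents in each cell
  w_r <- vapply(cells, function(k) {
    sum(data$weight[data$cell == k & data$resp])
  }, numeric(1))
  w_n <- vapply(cells, function(k) {
    sum(data$weight[data$cell == k & !data$resp])
  }, numeric(1))
  # Weighted respondent mean in each cell
  ybar_r <- vapply(cells, function(k) {
    keep <- data$cell == k & data$resp
    weighted.mean(data$y[keep], data$weight[keep])
  }, numeric(1))

  # Respondent-based estimates: unadjusted weights vs. weighting-class adjustment
  unadj <- sum(w_r * ybar_r) / sum(w_r)
  adj   <- sum((w_r + w_n) * ybar_r) / sum(w_r + w_n)

  do.call(rbind, lapply(multiples, function(m) {
    # Full-sample mean if nonrespondent means equal m times respondent means
    full <- sum(w_r * ybar_r + w_n * m * ybar_r) / sum(w_r + w_n)
    data.frame(
      assumed_multiple       = m,
      bias_of_unadj_estimate = unadj - full,
      bias_of_adj_estimate   = adj - full
    )
  }))
}

result <- bias_sketch(toy, multiples = c(0.5, 1, 1.5))
print(result)
```

In this sketch, a multiple of 1 gives zero bias for the adjusted estimate but a nonzero bias for the unadjusted one, because the two cells differ in both response rate and respondent mean. With a single cell, the two columns would coincide.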
References
See Petraglia et al. (2016) for an example of a range-of-bias analysis using these methods.
Petraglia, E., Van de Kerckhove, W., and Krenzke, T. (2016). Review of the Potential for Nonresponse Bias in FoodAPS 2012. Prepared for the Economic Research Service, U.S. Department of Agriculture. Washington, D.C.
Examples
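# Load the packages used in the examples
library(nrba)
suppressPackageStartupMessages(library(survey))

# Load example data and create a jackknife replicate design from the base weights
data(api)
base_weights_design <- svydesign(
  data = apiclus1,
  id = ~dnum,
  weights = ~pw,
  fpc = ~fpc
) |> as.svrepdesign(type = "JK1")

# Simulate a response status variable
# (the seed is set only so that the example is reproducible)
set.seed(2023)
base_weights_design$variables$response_status <- sample(
  x = c("Respondent", "Nonrespondent"),
  prob = c(0.75, 0.25),
  size = nrow(base_weights_design),
  replace = TRUE
)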

# Assess range of bias for mean of `api00`
# based on assuming nonrespondent means
# are equal to the 25th percentile or 75th percentile
# among respondents, within nonresponse adjustment cells
assess_range_of_bias(
  survey_design = base_weights_design,
  y_var = "api00",
  comparison_cell = "stype",
  status = "response_status",
  status_codes = c(
    "ER" = "Respondent",
    "EN" = "Nonrespondent",
    "IE" = "Ineligible",
    "UE" = "Unknown"
  ),
  assumed_percentile = c(0.25, 0.75)
)

# Assess range of bias for proportions of `sch.wide`
# based on assuming nonrespondent proportions
# are equal to some multiple of respondent proportions,
# within nonresponse adjustment cells
assess_range_of_bias(
  survey_design = base_weights_design,
  y_var = "sch.wide",
  comparison_cell = "stype",
  status = "response_status",
  status_codes = c(
    "ER" = "Respondent",
    "EN" = "Nonrespondent",
    "IE" = "Ineligible",
    "UE" = "Unknown"
  ),
  assumed_multiple = c(0.25, 0.75)
)