assess_range_of_bias {nrba}    R Documentation

Assess the range of possible bias based on specified assumptions about how nonrespondents differ from respondents

Description

This range-of-bias analysis assesses the range of possible nonresponse bias under varying assumptions about how nonrespondents differ from respondents. The range of potential bias is calculated for both unadjusted estimates (i.e., estimates computed with base weights) and nonresponse-adjusted estimates (i.e., estimates computed with nonresponse-adjusted weights).

Usage

assess_range_of_bias(
  survey_design,
  y_var,
  comparison_cell,
  status,
  status_codes,
  assumed_multiple = c(0.5, 0.75, 0.9, 1.1, 1.25, 1.5),
  assumed_percentile = NULL
)

Arguments

survey_design

A survey design object created with the 'survey' package

y_var

Name of a variable whose mean or proportion is to be estimated

comparison_cell

(Optional) The name of a variable in the data dividing the sample into cells. If supplied, then the analysis is based on assumptions about differences between respondents and nonrespondents within the same cell. Typically, the variable used is a nonresponse adjustment cell or post-stratification variable.

status

A character string giving the name of the variable representing response/eligibility status. The status variable should have at most four categories, representing eligible respondents (ER), eligible nonrespondents (EN), known ineligible cases (IE), and cases whose eligibility is unknown (UE).

status_codes

A named vector, with four entries named 'ER', 'EN', 'IE', and 'UE'. status_codes indicates how the values of the status variable are to be interpreted.
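
Before calling the function, it can help to confirm that every value of the status variable is covered by status_codes. The following is a minimal base R sketch, not part of the package; the status values below are hypothetical, and whether assess_range_of_bias() performs a similar check itself is not stated here.

```r
# Mapping from the four status categories to the values used in the data
status_codes <- c("ER" = "Respondent",
                  "EN" = "Nonrespondent",
                  "IE" = "Ineligible",
                  "UE" = "Unknown")

# Hypothetical values of the status variable (illustration only)
status_values <- c("Respondent", "Nonrespondent", "Respondent", "Refused")

# Values in the data that status_codes does not account for
unmatched <- setdiff(unique(status_values), status_codes)
print(unmatched)
```

Any value listed in unmatched would need to be recoded, or the mapping revised, before the analysis is run.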

assumed_multiple

One or more numeric values. Within each nonresponse adjustment cell (as defined by comparison_cell), the mean for nonrespondents is assumed to be the specified multiple of the mean for respondents. If y_var is a categorical variable, then the assumed nonrespondent mean (i.e., the proportion) in each cell is capped at 1.
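
The assumption described above can be sketched in base R as follows. The respondent proportions and cell names are hypothetical values chosen for illustration; this shows only the arithmetic of the assumption, not the package's internal code.

```r
# Hypothetical respondent proportions in three adjustment cells
resp_prop <- c(cell_1 = 0.60, cell_2 = 0.80, cell_3 = 0.90)

# One assumed multiple, as would be passed via assumed_multiple
assumed_multiple <- 1.25

# Assumed nonrespondent proportion in each cell:
# the multiple of the respondent proportion, capped at 1
nonresp_prop <- pmin(assumed_multiple * resp_prop, 1)
print(nonresp_prop)
```

In this sketch, the second and third cells reach the cap because 1.25 times the respondent proportion is at least 1.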

assumed_percentile

One or more numeric values, ranging from 0 to 1. Within each nonresponse adjustment cell (as defined by comparison_cell), the mean of a continuous variable among nonrespondents is assumed to equal the specified percentile of the variable among respondents. Use assumed_percentile only when y_var is numeric. Quantiles are estimated with weights, using the function svyquantile(..., qrule = "hf2").
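
To illustrate the kind of weighted quantile involved, the sketch below estimates the 25th and 75th percentiles of api00 among simulated respondents. It assumes a version of the 'survey' package in which svyquantile() accepts the qrule argument (4.1 or later, to the best of our knowledge). The responded indicator is simulated for illustration only, and for brevity the quantiles are computed over all respondents rather than within cells.

```r
library(survey)
data(api)

# Simulated response indicator (illustration only, not real nonresponse)
set.seed(2024)
apiclus1$responded <- runif(nrow(apiclus1)) < 0.75

# Base-weight design for the cluster sample
design <- svydesign(data = apiclus1, id = ~dnum, weights = ~pw, fpc = ~fpc)

# Restrict to respondents, then estimate weighted percentiles
respondents <- subset(design, responded)
resp_quantiles <- coef(
  svyquantile(~api00, respondents, quantiles = c(0.25, 0.75), qrule = "hf2")
)
print(resp_quantiles)
```

Under the percentile assumption, values such as these would stand in for the nonrespondent mean.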

Value

A data frame summarizing the range of bias under each assumption. For a numeric outcome variable, there is one row per value of assumed_multiple or assumed_percentile. For a categorical outcome variable, there is one row per combination of category and assumed_multiple or assumed_percentile.

The column bias_of_unadj_estimate is the nonresponse bias of the estimate from respondents produced using the unadjusted weights. The column bias_of_adj_estimate is the nonresponse bias of the estimate from respondents produced using nonresponse-adjusted weights, based on a weighting-class adjustment with comparison_cell as the weighting class variable. If no comparison_cell is specified, the two bias estimates will be the same.
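
The relationship between the two bias columns can be illustrated with a small worked example in base R. All numbers are hypothetical, and the arithmetic is a sketch of a standard weighting-class calculation under the assumed-multiple approach. It is not necessarily the exact computation the package performs.

```r
# Hypothetical inputs for two comparison cells
cell_weight <- c(A = 600, B = 400)   # total base weight of eligible cases
resp_rate   <- c(A = 0.8, B = 0.5)   # weighted response rate in each cell
resp_mean   <- c(A = 50,  B = 70)    # respondent mean in each cell
assumed_multiple <- 0.9

# Assumed nonrespondent mean and implied full-sample mean in each cell
nonresp_mean <- assumed_multiple * resp_mean
cell_mean    <- resp_rate * resp_mean + (1 - resp_rate) * nonresp_mean
full_mean    <- sum(cell_weight * cell_mean) / sum(cell_weight)

# Respondent-based estimate using base weights only
unadj_estimate <- sum(cell_weight * resp_rate * resp_mean) /
  sum(cell_weight * resp_rate)

# Respondent-based estimate after a weighting-class adjustment:
# respondents in each cell are weighted up to the cell's total weight
adj_estimate <- sum(cell_weight * resp_mean) / sum(cell_weight)

bias_of_unadj_estimate <- unadj_estimate - full_mean
bias_of_adj_estimate   <- adj_estimate - full_mean
print(c(unadj = bias_of_unadj_estimate, adj = bias_of_adj_estimate))
```

With a single cell, the two estimates coincide in this sketch, which is consistent with the two bias columns being equal when no comparison_cell is given.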

References

See Petraglia et al. (2016) for an example of a range-of-bias analysis using these methods.

Examples

# Load example data

suppressPackageStartupMessages(library(survey))
data(api)

base_weights_design <- svydesign(
  data    = apiclus1,
  id      = ~dnum,
  weights = ~pw,
  fpc     = ~fpc
) |> as.svrepdesign(type = "JK1")

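# Simulate a response status variable
# (the seed is set so that the example is reproducible)
set.seed(2024)
base_weights_design$variables$response_status <- sample(
  x = c("Respondent", "Nonrespondent"),
  prob = c(0.75, 0.25),
  size = nrow(base_weights_design),
  replace = TRUE
)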

# Assess range of bias for mean of `api00`
# based on assuming nonrespondent means
# are equal to the 25th percentile or 75th percentile
# among respondents, within nonresponse adjustment cells

  assess_range_of_bias(
    survey_design = base_weights_design,
    y_var = "api00",
    comparison_cell = "stype",
    status = "response_status",
    status_codes = c("ER" = "Respondent",
                     "EN" = "Nonrespondent",
                     "IE" = "Ineligible",
                     "UE" = "Unknown"),
    assumed_percentile = c(0.25, 0.75)
  )

# Assess range of bias for proportions of `sch.wide`
# based on assuming nonrespondent proportions
# are equal to some multiple of respondent proportions,
# within nonresponse adjustment cells

  assess_range_of_bias(
    survey_design = base_weights_design,
    y_var = "sch.wide",
    comparison_cell = "stype",
    status = "response_status",
    status_codes = c("ER" = "Respondent",
                     "EN" = "Nonrespondent",
                     "IE" = "Ineligible",
                     "UE" = "Unknown"),
    assumed_multiple = c(0.25, 0.75)
  )

[Package nrba version 0.3.1 Index]