R: t-test of differences in means/percentages relative to...

t_test_vs_external_estimate {nrba}

R Documentation

t-test of differences in means/percentages relative to external estimates

Description

Compare estimated means/percentages from the present survey to external estimates from a benchmark source. A t-test is used to evaluate whether the survey's estimates differ from the external estimates.

Usage

t_test_vs_external_estimate(
  survey_design,
  y_var,
  ext_ests,
  ext_std_errors = NULL,
  na.rm = TRUE,
  null_difference = 0,
  alternative = "unequal",
  degrees_of_freedom = survey::degf(survey_design) - 1
)

Arguments

`survey_design`	A survey design object created with the `survey` package.
`y_var`	Name of dependent variable. For categorical variables, percentages of each category are tested.
`ext_ests`	A numeric vector containing the external estimate of the mean for the dependent variable. If `y_var` is a categorical variable, a named vector of external estimates must be provided, with names identifying the categories (see Examples).
`ext_std_errors`	(Optional) The standard errors of the external estimates. This is useful if the external data are estimated with an appreciable level of uncertainty, for instance if the external data come from a survey with a small-to-moderate sample size. If supplied, the variance of the difference between the survey and external estimates is estimated by adding the variance of the external estimates to the estimated variance of the survey's estimates.
`na.rm`	Whether to drop cases with missing values for `y_var`.
`null_difference`	The hypothesized difference between the estimate and the external mean. Default is `0`.
`alternative`	Can be one of the following: `'unequal'` (two-sided test of whether the difference in means is equal to `null_difference`); `'less'` (one-sided test of whether the difference is less than `null_difference`); `'greater'` (one-sided test of whether the difference is greater than `null_difference`). Default is `'unequal'`.
`degrees_of_freedom`	The degrees of freedom to use for the test's reference distribution. The default is the design degrees of freedom minus one, where the design degrees of freedom are estimated using the survey package's `degf` method.
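
The arguments above combine into a single test statistic, which the following base R sketch works through by hand for one mean. It is an illustration under stated assumptions, not the package's implementation: `sketch_t_test()` is a hypothetical helper, the input numbers are made up, the difference is assumed to be the survey estimate minus the external estimate, and the two sources are treated as independent so that their variances add.

```r
# Hypothetical helper (not part of nrba): hand computation of a t-test
# comparing one survey estimate to one external estimate.
sketch_t_test <- function(estimate, std_error, ext_est, ext_std_error = 0,
                          null_difference = 0, alternative = "unequal",
                          degrees_of_freedom) {
  # Assumed direction of the difference: survey estimate minus external estimate
  difference <- estimate - ext_est
  # Variance of the difference: survey variance plus external variance
  se_difference <- sqrt(std_error^2 + ext_std_error^2)
  t_statistic <- (difference - null_difference) / se_difference
  # p-value from the t distribution with the supplied degrees of freedom
  p_value <- switch(alternative,
    unequal = 2 * pt(abs(t_statistic), df = degrees_of_freedom, lower.tail = FALSE),
    less    = pt(t_statistic, df = degrees_of_freedom, lower.tail = TRUE),
    greater = pt(t_statistic, df = degrees_of_freedom, lower.tail = FALSE),
    stop("`alternative` must be 'unequal', 'less', or 'greater'")
  )
  data.frame(
    difference = difference,
    se_difference = se_difference,
    t_statistic = t_statistic,
    p_value = p_value,
    degrees_of_freedom = degrees_of_freedom
  )
}

# Made-up inputs: survey mean 11.4 (SE 0.2), benchmark 11, 40 degrees of freedom
without_ext_se <- sketch_t_test(estimate = 11.4, std_error = 0.2, ext_est = 11,
                                degrees_of_freedom = 40)
with_ext_se <- sketch_t_test(estimate = 11.4, std_error = 0.2, ext_est = 11,
                             ext_std_error = 0.15, degrees_of_freedom = 40)
one_sided <- sketch_t_test(estimate = 11.4, std_error = 0.2, ext_est = 11,
                           alternative = "greater", degrees_of_freedom = 40)
```

In this sketch, supplying a nonzero external standard error widens the standard error of the difference, which shrinks the t-statistic and makes the test more conservative.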

Value

A data frame describing the results of the t-tests, one row per mean being compared.

References

See Brick and Bose (2001) for an example of this analysis method and a discussion of its limitations.

Brick, M., and Bose, J. (2001). Analysis of Potential Nonresponse Bias. In Proceedings of the Section on Survey Research Methods. Alexandria, VA: American Statistical Association. http://www.asasrms.org/Proceedings/y2001/Proceed/00021.pdf

Examples



library(survey)

# Create a survey design ----
data("involvement_survey_str2s", package = 'nrba')

involvement_survey_sample <- svydesign(
  data = involvement_survey_str2s,
  weights = ~ BASE_WEIGHT,
  strata =  ~ SCHOOL_DISTRICT,
  ids =     ~ SCHOOL_ID             + UNIQUE_ID,
  fpc =     ~ N_SCHOOLS_IN_DISTRICT + N_STUDENTS_IN_SCHOOL
)

# Subset to only include survey respondents ----

involvement_survey_respondents <- subset(involvement_survey_sample,
                                         RESPONSE_STATUS == "Respondent")

# Test whether percentages of categorical variable differ from benchmark ----

parent_email_benchmark <- c(
  'Has Email' = 0.85,
  'No Email' = 0.15
)

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "PARENT_HAS_EMAIL",
  ext_ests = parent_email_benchmark
)

# Test whether the sample mean differs from the population benchmark ----

average_age_benchmark <- 11

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "STUDENT_AGE",
  ext_ests = average_age_benchmark,
  null_difference = 0
)

[Package nrba version 0.3.1 Index]