t_test_vs_external_estimate {nrba}    R Documentation

t-test of differences in means/percentages relative to external estimates

Description

Compare estimated means/percentages from the present survey to external estimates from a benchmark source. A t-test is used to evaluate whether the survey's estimates differ from the external estimates.

Usage

t_test_vs_external_estimate(
  survey_design,
  y_var,
  ext_ests,
  ext_std_errors = NULL,
  na.rm = TRUE,
  null_difference = 0,
  alternative = "unequal",
  degrees_of_freedom = survey::degf(survey_design) - 1
)

Arguments

survey_design

A survey design object created with the survey package.

y_var

Name of the dependent variable, given as a character string. For categorical variables, the percentage in each category is tested.

ext_ests

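A numeric vector containing the external estimate of the mean for the dependent variable. If the variable is categorical, a named vector of proportions must be provided, with one element per category and names matching the category labels, for example c('Has Email' = 0.85, 'No Email' = 0.15).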

ext_std_errors

(Optional) The standard errors of the external estimates. This is useful if the external data are estimated with an appreciable level of uncertainty, for instance if the external data come from a survey with a small-to-moderate sample size. If supplied, the variance of the difference between the survey and external estimates is estimated by adding the variance of the external estimates to the estimated variance of the survey's estimates.

na.rm

Whether to drop cases with missing values for y_var. Default is TRUE.

null_difference

The hypothesized difference between the survey estimate and the external estimate. Default is 0.

alternative

The alternative hypothesis for the test. Can be one of the following (default is 'unequal'):

  • 'unequal': two-sided test of whether the difference in means is equal to null_difference

  • 'less': one-sided test of whether difference is less than null_difference

  • 'greater': one-sided test of whether difference is greater than null_difference

degrees_of_freedom

The degrees of freedom to use for the test's reference distribution. The default is the design degrees of freedom minus one, where the design degrees of freedom are obtained with the survey package's degf() function.

Value

A data frame describing the results of the t-tests, with one row per mean or percentage being compared.

References

See Brick and Bose (2001) for an example of this analysis method and a discussion of its limitations.

Examples



library(survey)

# Create a survey design ----
data("involvement_survey_str2s", package = 'nrba')

involvement_survey_sample <- svydesign(
  data = involvement_survey_str2s,
  weights = ~ BASE_WEIGHT,
  strata =  ~ SCHOOL_DISTRICT,
  ids =     ~ SCHOOL_ID             + UNIQUE_ID,
  fpc =     ~ N_SCHOOLS_IN_DISTRICT + N_STUDENTS_IN_SCHOOL
)

# Subset to only include survey respondents ----

involvement_survey_respondents <- subset(involvement_survey_sample,
                                         RESPONSE_STATUS == "Respondent")

# Test whether percentages of categorical variable differ from benchmark ----

parent_email_benchmark <- c(
  'Has Email' = 0.85,
  'No Email' = 0.15
)

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "PARENT_HAS_EMAIL",
  ext_ests = parent_email_benchmark
)
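
# One-sided variant of the test above (a sketch) ----
# The alternative argument accepts 'unequal', 'less', or 'greater'.
# This sketch assumes the difference is defined as the survey estimate
# minus the external estimate; confirm the direction from the returned
# data frame before interpreting a one-sided result.

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "PARENT_HAS_EMAIL",
  ext_ests = parent_email_benchmark,
  alternative = "less"
)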

# Test whether the sample mean differs from the population benchmark ----

average_age_benchmark <- 11

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "STUDENT_AGE",
  ext_ests = average_age_benchmark,
  null_difference = 0
)
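
# Account for uncertainty in the benchmark (a sketch) ----
# The standard error below is a hypothetical value chosen for illustration;
# it does not come from the package or its data.

average_age_benchmark_se <- 0.2

t_test_vs_external_estimate(
  survey_design = involvement_survey_respondents,
  y_var = "STUDENT_AGE",
  ext_ests = average_age_benchmark,
  ext_std_errors = average_age_benchmark_se
)

# Hand calculation of the variance sum described for ext_std_errors ----
# A base R sketch with made-up inputs. It illustrates the documented idea
# (add the external variance to the survey variance) and is not taken from
# the package's internal code.

survey_est      <- 11.4  # hypothetical survey estimate of the mean
survey_se       <- 0.3   # hypothetical standard error of the survey estimate
ext_est         <- 11    # external (benchmark) estimate
ext_se          <- 0.2   # hypothetical standard error of the benchmark
null_diff       <- 0     # hypothesized difference
deg_freedom     <- 50    # hypothetical degrees of freedom

# Standard error of the difference: square root of the summed variances
diff_se <- sqrt(survey_se^2 + ext_se^2)

# t statistic for the difference relative to the hypothesized difference
t_stat <- (survey_est - ext_est - null_diff) / diff_se

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = deg_freedom)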


[Package nrba version 0.3.1]