impute_w8margin {svyweight}    R Documentation

Impute NAs in w8margin Object

Description

Imputes NA values in a weight target (in w8margin form), based on the observed distribution of the variable in a dataset.

Usage

impute_w8margin(w8margin, observed, weights = NULL, rebase = TRUE)

Arguments

w8margin

w8margin object, with NA values that should be imputed based on observed data.

observed

factor or character vector, containing observed data used for imputing targets.

weights

numeric vector of weights, the same length as observed, to be used when computing the distribution of the observed variable. NULL is equivalent to a vector where all elements are 1, and indicates the data is unweighted.

rebase

logical, indicating whether non-NA weight targets should be adjusted so that the total target sample size is unchanged (rebase = TRUE), or whether non-NA weight targets should remain the same, so that the total target sample size increases (rebase = FALSE).

Details

Any NA target frequencies in w8margin are imputed using the percentage distribution in observed, from svytable(~observed, Ntotal = 1, ...). The percentage is multiplied by the desired target sample size. For example, if a category has a target of NA, the desired total sample size is 1500, and the observed frequency of that category is 10%, the imputed target will be 150 (10% * 1500). If a weights argument is provided, then weighted percentage distributions are used; this may be useful when design weights are present, or when first raking on variables with complete targets.
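The calculation described above can be sketched in base R. This is only an illustration of the arithmetic, not the package's internal code: the observed vector, the design weights, and the helper name category_share are all hypothetical.

```r
# Hypothetical observed responses (10 cases); not taken from gles17
observed <- factor(c("voted", "voted", "voted", "voted", "voted", "voted",
                     "did not vote", "did not vote", "did not vote",
                     "ineligible"))

# Hypothetical helper: weighted share of each category.
# weights = NULL is treated as a weight of 1 for every case (unweighted data).
category_share <- function(observed, weights = NULL) {
  if (is.null(weights)) weights <- rep(1, length(observed))
  tapply(weights, observed, sum) / sum(weights)
}

# Unweighted: the observed share of "ineligible" times the target sample size
target_total <- 1500
share <- category_share(observed)
imputed_ineligible <- as.numeric(share["ineligible"]) * target_total
imputed_ineligible

# With hypothetical design weights, the share (and so the imputed target) changes
design_weights <- c(rep(1, 9), 3)
weighted_share <- category_share(observed, weights = design_weights)
weighted_share
```

In this made-up sample one case in ten is "ineligible", so the unweighted imputed target is 10% of 1500. Giving that case a weight of 3 raises its weighted share to 3 out of 12.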

If rebase == TRUE (the default), targets for non-NA categories are scaled down so that the total target frequency (sum(w8margin$Freq, na.rm = TRUE)) remains constant, after imputing new category targets. If rebase == FALSE, targets for non-NA categories remain constant, and the total target frequency will increase.
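The difference between the two rebase settings can be illustrated with a short base R sketch. The observed shares below are hypothetical, and the sketch assumes that the imputed target is the observed share multiplied by the original total in both cases; it mirrors the description above rather than reproducing the package's implementation.

```r
# Target frequencies for a total of 1500, with one missing category
target <- c(voted = 1072.5, `did not vote` = 427.5, ineligible = NA)

# Hypothetical observed shares, in the same order as the targets
observed_share <- c(voted = 0.65, `did not vote` = 0.25, ineligible = 0.10)

original_total <- sum(target, na.rm = TRUE)
missing <- is.na(target)

# Assumption: imputed target = observed share * original total
imputed <- observed_share[missing] * original_total

# rebase = FALSE: non-NA targets are untouched, so the total grows
no_rebase <- target
no_rebase[missing] <- imputed
sum(no_rebase)

# rebase = TRUE: non-NA targets are scaled down to make room for the
# imputed category, so the total stays at its original value
rebased <- target
scale_factor <- (original_total - sum(imputed)) / original_total
rebased[!missing] <- target[!missing] * scale_factor
rebased[missing] <- imputed
sum(rebased)
```

Under these assumptions, the non-rebased total rises from 1500 to 1650, while the rebased targets still sum to 1500, with the non-NA categories each reduced by 10%.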

There is an important theoretical distinction between missing targets for conceptually valid categories and missing observed data due to non-response or refusal. It is only conceptually appropriate to impute targets if the targets themselves are missing. When handling missing observed data, multiple imputation techniques (such as mice::mice()) will often produce better results, except when missingness is closely related to the weighting variable (technically referred to as "missing not at random").

Value

A w8margin object, where NA target frequencies have been replaced using the observed distribution of the weighting variable.

Examples

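library(svyweight)

# Target with a missing frequency for the "ineligible" category
turnout_w8margin <- as.w8margin(
    c(voted = .715, `did not vote` = .285, ineligible = NA),
    varname = "turnout2013",
    na.allow = TRUE,
    samplesize = 1500)

# Impute the NA target from the observed distribution of turnout2013 in gles17
impute_w8margin(turnout_w8margin, observed = gles17$turnout2013)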

[Package svyweight version 0.1.0 Index]