twoCatCI {SynthTools} | R Documentation |
Confidence intervals and standard errors for the cross-tabulation of two categorical variables derived from multiply imputed datasets.
Description
This function calculates confidence intervals and standard errors for the cell proportions in the cross-tabulation of two categorical variables, using multiply imputed datasets. It also returns a YES/NO indicator of whether each observed value falls within its confidence interval. The confidence intervals and standard errors are calculated from formulas adapted for fully and partially synthetic datasets. See the reference for more information.
Usage
twoCatCI(obs_data, imp_data_list, type, vars, sig = 4, alpha = 0.05)
Arguments
obs_data |
The original (observed) dataset, of type "data.frame", against which the imputed datasets are compared. |
imp_data_list |
A list composed of |
type |
Specifies which type of datasets are in |
vars |
A character vector naming the two categorical variables being checked. Each variable should be of type "factor". |
sig |
The number of significant digits in the output dataframes. Defaults to 4. |
alpha |
Test size (significance level). Defaults to 0.05. |
Details
This function was developed with the intention of making the job of researching synthetic data utility a bit easier by providing another way of measuring utility.
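The combining rules mentioned in the Description can be sketched for a single cell proportion as follows. This is a minimal illustration of the standard rules for partially and fully synthetic data discussed in the reference, not the package's actual code: the helper name `combine_synthetic`, its arguments, and the binomial within-dataset variance `q * (1 - q) / n` are all assumptions made for the example, and it is also an assumption that `twoCatCI` follows these rules internally.

```r
# Hypothetical helper, not part of SynthTools: combine one cell proportion
# across m synthetic datasets.
#   q: vector of the m estimated proportions (one per synthetic dataset)
#   u: vector of the m within-dataset variance estimates
combine_synthetic <- function(q, u, type = c("partially", "fully"),
                              alpha = 0.05) {
  type <- match.arg(type)
  m <- length(q)
  q_bar <- mean(q)   # combined point estimate
  b <- var(q)        # between-dataset variance
  u_bar <- mean(u)   # average within-dataset variance

  if (type == "partially") {
    t_var <- u_bar + b / m
    df <- (m - 1) * (1 + u_bar / (b / m))^2
  } else {
    t_var <- (1 + 1 / m) * b - u_bar
    df <- (m - 1) * (1 - u_bar / ((1 + 1 / m) * b))^2
  }
  # The fully synthetic variance estimate can be negative; this sketch
  # simply stops instead of applying an adjustment.
  if (t_var <= 0) stop("non-positive variance estimate")

  se <- sqrt(t_var)
  half_width <- qt(1 - alpha / 2, df) * se
  c(estimate = q_bar, se = se,
    lower = q_bar - half_width, upper = q_bar + half_width)
}

# Made-up proportions for one cell from m = 5 synthetic datasets of size n.
n <- 1000
q <- c(0.30, 0.32, 0.28, 0.31, 0.29)
u <- q * (1 - q) / n   # assumed binomial variance of a proportion
res <- combine_synthetic(q, u, type = "partially")
print(signif(res, 4))
```

The `type` argument here mirrors the `"partially"` value used in the Examples. The two branches differ only in the variance estimate and the degrees of freedom.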
Value
This function returns a list of five data frames:
Observed |
A cross-tabular proportion of observed values |
Lower |
Lower limit of the confidence interval |
Upper |
Upper limit of the confidence interval |
SEs |
Standard Errors |
CI_Indicator |
"YES"/"NO" indicating whether or not the observed value is within the confidence interval |
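To make the shape of these data frames concrete, the sketch below builds a toy version of `Observed` and derives a `CI_Indicator`-style table from made-up interval limits. It only illustrates how the pieces relate. The data, the limits, and all object names are invented for the example and do not come from SynthTools.

```r
# Toy observed data; every name and value here is illustrative only.
obs <- data.frame(
  sex  = factor(c("F", "M", "F", "M", "F", "M")),
  race = factor(c("A", "A", "B", "B", "A", "B"))
)

# Cross-tabulated cell proportions (they sum to 1 over the whole table).
observed <- as.data.frame.matrix(prop.table(table(obs$sex, obs$race)))

# Made-up interval limits with the same shape as the observed table.
lower <- observed - 0.05
upper <- observed + 0.05
upper[1, 1] <- observed[1, 1] - 0.01   # force one cell outside its interval

# "YES" where the observed proportion lies inside [lower, upper].
inside <- as.matrix(observed >= lower & observed <= upper)
ci_indicator <- as.data.frame(ifelse(inside, "YES", "NO"))

print(observed)
print(ci_indicator)
```

In real use the limits would come from the synthetic datasets rather than being offsets from the observed table, so a "NO" flags a cell where the synthetic data fail to cover the observed proportion.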
References
Reiter JP, Raghunathan TE (2007). “The Multiple Adaptations of Multiple Imputation.” Journal of the American Statistical Association.
Examples
library(SynthTools)

# PPA is the observed dataset. PPAm5 is a list of 5 partially synthetic datasets derived from PPA.
# "sex" and "race" are categorical variables present in the synthesized datasets.
# 3 significant digits are desired in the output data frames.
twoCatCI(PPA, PPAm5, "partially", c("sex", "race"), sig = 3)