oneCatCI {SynthTools} | R Documentation |
Confidence intervals and standard errors for one synthetic categorical variable derived from multiply imputed datasets.
Description
This function calculates confidence intervals and standard errors for the response proportions of a specified categorical variable across multiply imputed datasets, and also gives a YES/NO indicator for whether or not the observed value is within the confidence interval. The confidence intervals and standard errors are calculated from variance formulas that are specific to whether the multiply imputed datasets are fully or partially synthetic. See the reference for more information.
Usage
oneCatCI(obs_data, imp_data_list, type, var, sig = 6, alpha = 0.05)
Arguments
obs_data |
The original (observed) dataset to which the datasets in imp_data_list will be compared. Must be of type "data.frame". |
imp_data_list |
A list of datasets that are either synthetic or contain imputed values. |
type |
Specifies which type of datasets are in imp_data_list, for example "partially" for partially synthetic datasets. |
var |
The name of the categorical variable being checked, given as a character string (for example "sex"). The variable itself should be of type "factor". |
sig |
The number of significant digits in the output dataframe. Defaults to 6. |
alpha |
Significance level (test size) used for the confidence intervals. Defaults to 0.05. |
Details
This function was developed to make research on synthetic data utility easier by providing another way of measuring utility.
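For orientation, the variance formulas referred to in the Description can be written as follows, using the combining rules given in Reiter and Raghunathan (2007). The notation is an assumption introduced here, as is the simple-random-sampling variance used for each dataset. This page does not state which reference distribution oneCatCI uses or how it treats a negative fully synthetic variance estimate, so read this as a sketch rather than a description of the implementation.

```latex
% q^{(i)}: proportion of one response category in synthetic dataset i (i = 1, ..., m)
% u^{(i)}: its estimated variance, assumed here to be q^{(i)}(1 - q^{(i)}) / n
\begin{align*}
\bar{q}_m &= \frac{1}{m}\sum_{i=1}^{m} q^{(i)}, &
b_m &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(q^{(i)} - \bar{q}_m\bigr)^2, &
\bar{u}_m &= \frac{1}{m}\sum_{i=1}^{m} u^{(i)} \\[6pt]
\text{partially synthetic:}\quad
T_p &= \bar{u}_m + \frac{b_m}{m}, &
\nu_p &= (m-1)\left(1 + \frac{\bar{u}_m}{b_m/m}\right)^{2} \\[6pt]
\text{fully synthetic:}\quad
T_f &= \left(1 + \frac{1}{m}\right) b_m - \bar{u}_m, &
\nu_f &= (m-1)\left(1 - \frac{\bar{u}_m}{(1 + 1/m)\, b_m}\right)^{2}
\end{align*}
% Interval in either case: \bar{q}_m \pm t_{\nu,\,1-\alpha/2}\,\sqrt{T}
```

The standard error in either case is the square root of the corresponding variance estimate. Note that the fully synthetic estimate subtracts the within-dataset variance, so it can come out negative when the number of datasets is small.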
Value
This function returns a dataframe with the variable's responses, observed values, lower and upper limits of the confidence interval, standard error, and "YES"/"NO" indicating whether or not the observed value is within the confidence interval.
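To make the shape of such a result concrete, here is a minimal base R sketch that builds a comparable table by hand for the partially synthetic case. It is not the SynthTools implementation: the helper name `partial_synth_ci`, the output column names, the simulated data, the simple-random-sampling variance, and the t reference distribution are all assumptions made for illustration.

```r
# Hypothetical helper (not part of SynthTools): partially synthetic
# confidence intervals for the proportion of each level of a factor.
partial_synth_ci <- function(obs, syn_list, var_name, alpha = 0.05) {
  m   <- length(syn_list)
  lev <- levels(obs[[var_name]])

  # m x K matrix: row i holds the level proportions in synthetic dataset i
  q <- do.call(rbind, lapply(syn_list, function(d) {
    as.numeric(prop.table(table(factor(d[[var_name]], levels = lev))))
  }))
  n <- vapply(syn_list, nrow, integer(1))

  # Assumed per-dataset variance of a proportion: q(1 - q) / n
  # (n recycles down the columns, so row i is divided by n[i])
  u <- q * (1 - q) / n

  q_bar <- colMeans(q)               # combined point estimate
  b_m   <- apply(q, 2, stats::var)   # between-dataset variance
  u_bar <- colMeans(u)               # average within-dataset variance

  # Partially synthetic combining rule (Reiter and Raghunathan, 2007)
  T_p  <- u_bar + b_m / m
  df   <- (m - 1) * (1 + u_bar / (b_m / m))^2
  half <- stats::qt(1 - alpha / 2, df) * sqrt(T_p)

  observed <- as.numeric(prop.table(table(obs[[var_name]])))
  lower    <- q_bar - half
  upper    <- q_bar + half

  # Column names below are illustrative, not those returned by oneCatCI
  data.frame(
    response = lev,
    observed = observed,
    lower    = lower,
    upper    = upper,
    se       = sqrt(T_p),
    in_ci    = ifelse(observed >= lower & observed <= upper, "YES", "NO")
  )
}

# Simulated stand-ins for an observed dataset and 5 synthetic copies
set.seed(1)
obs <- data.frame(sex = factor(sample(c("F", "M"), 200, replace = TRUE),
                               levels = c("F", "M")))
p_obs <- as.numeric(prop.table(table(obs$sex)))
syn <- lapply(1:5, function(i) {
  data.frame(sex = factor(sample(c("F", "M"), 200, replace = TRUE, prob = p_obs),
                          levels = c("F", "M")))
})

res <- partial_synth_ci(obs, syn, "sex")
print(res)
```

The result has one row per response level, which matches the layout described above. Rounding to `sig` significant digits could be applied to the numeric columns with `signif()` before returning.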
References
Reiter JP, Raghunathan TE (2007). “The Multiple Adaptations of Multiple Imputation.” Journal of the American Statistical Association.
Examples
# PPA is the observed data set; PPAm5 is a list of 5 partially synthetic data sets derived from PPA.
# sex is a categorical variable within these data sets. 3 significant digits are desired.
oneCatCI(obs_data=PPA, imp_data_list=PPAm5, type="partially", var="sex", sig=3)