summary.fit.synds {synthpop} | R Documentation |
Inference from synthetic data
Description
Combines the results of models fitted to each of the m
synthetic data sets.
Usage
## S3 method for class 'fit.synds'
summary(object, population.inference = FALSE, msel = NULL,
real.varcov = NULL, incomplete = NULL, ...)
## S3 method for class 'summary.fit.synds'
print(x, ...)
Arguments
object |
an object of class |
population.inference |
a logical value indicating whether inference
should be made to population quantities. If |
msel |
index or indices of the synthetic datasets ( |
real.varcov |
the estimated variance-covariance matrix of the fit of the
model to the original data. This parameter is used in the function
|
incomplete |
a logical value indicating whether population inference for
incomplete synthesis is to be used. If this is left at a |
... |
additional parameters. |
x |
an object of class |
Details
The mean of the estimates from each of the m synthetic data sets yields asymptotically unbiased estimates of the coefficients if the observed data conform to the distribution used for synthesis. The standard errors are estimated differently depending on whether inference is made for the results that we would expect to obtain from the observed data or for the parameters of the population from which we assume the observed data are sampled. The standard errors also differ according to whether the synthetic data were produced using simple or proper synthesis (for details see Raab et al. (2017)).
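The averaging step described above can be sketched by hand. This is a minimal illustration, not the package's internal code: it assumes that a `synds` object created with m > 1 stores its synthetic data frames as a list in `$syn`, and the names `fits`, `coef_by_syn` and `coef_mean` are made up for the example. It covers only the point estimates; the standard errors depend on the type of inference and synthesis and are not reproduced here.

```r
library(synthpop)

ods <- SD2011[1:1000, c("sex", "age", "edu", "ls", "smoke")]
s1  <- syn(ods, m = 5, seed = 2024)   # seed fixed only for reproducibility

# Fit the same model separately to each of the m synthetic data sets
# (assumes s1$syn is a list of m data frames)
fits <- lapply(s1$syn, function(d)
  glm(smoke ~ sex + age + edu + ls, data = d, family = "binomial"))

# One column of coefficients per synthetic data set
coef_by_syn <- sapply(fits, coef)

# Combined point estimate: the mean over the m syntheses
coef_mean <- rowMeans(coef_by_syn)
coef_mean
```

If the combination works as described, these means should be close to the estimates reported in the `coefficients` component of `summary()` applied to the corresponding `glm.synds` fit.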
Value
An object of class summary.fit.synds, which is a list with the following components:
call |
the original call to |
proper |
a logical value indicating whether synthetic data were generated using proper synthesis. |
population.inference |
a logical value indicating whether inference is made to population coefficients or to the results that would be expected from an analysis of the original data (see above). |
incomplete |
a logical value indicating whether the dependent variable
in the model was not synthesised. It is derived in the synthpop
implementation of the fitting functions ( |
fitting.function |
function used to fit the model. |
m |
the number of synthetic versions of the original (observed) data. |
coefficients |
a matrix with combined estimates. If inference is
required to the results that would be obtained from an analysis of the
original data, ( |
n |
the number of cases in the original data. |
k |
the number of cases in the synthesised data. Note that if |
analyses |
|
msel |
index or indices of synthetic data copies for which summaries
of fitted models are produced. If |
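As a hedged illustration of the components listed above, the returned list can be inspected directly. The object names `res` and `res_sel` are made up for this sketch, and the printed values depend on the random synthesis.

```r
library(synthpop)

ods <- SD2011[1:1000, c("sex", "age", "edu", "ls", "smoke")]
s1  <- syn(ods, m = 5, seed = 2024)   # seed fixed only for reproducibility
f1  <- glm.synds(smoke ~ sex + age + edu + ls, data = s1, family = "binomial")

res <- summary(f1, population.inference = TRUE)

res$m                      # number of synthetic versions of the data
res$proper                 # was proper synthesis used?
res$population.inference   # type of inference requested
res$n                      # cases in the original data
res$k                      # cases in the synthesised data
res$coefficients           # matrix with the combined estimates

# Request summaries for the first two synthetic data sets as well
res_sel <- summary(f1, msel = 1:2)
res_sel$msel
```

Working with the components rather than the printed output is convenient when the combined estimates feed into further computation, such as tables or plots.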
References
Nowok, B., Raab, G.M. and Dibben, C. (2016). synthpop: Bespoke creation of synthetic data in R. Journal of Statistical Software, 74(11), 1-26. doi:10.18637/jss.v074.i11.
Raab, G.M., Nowok, B. and Dibben, C. (2017). Practical data synthesis for large samples. Journal of Privacy and Confidentiality, 7(3), 67-97. Available at: https://journalprivacyconfidentiality.org/index.php/jpc/article/view/407
Reiter, J.P. (2003). Inference for partially synthetic, public use microdata sets. Survey Methodology, 29, 181-188.
See Also
compare.fit.synds, summary, print
Examples
library(synthpop)

# Subset of the SD2011 survey data shipped with synthpop
ods <- SD2011[1:1000, c("sex", "age", "edu", "ls", "smoke")]

### simple synthesis
s1 <- syn(ods, m = 5)
f1 <- glm.synds(smoke ~ sex + age + edu + ls, data = s1, family = "binomial")
summary(f1)                               # inference to the original-data results
summary(f1, population.inference = TRUE)  # inference to population coefficients

### proper synthesis
s2 <- syn(ods, m = 5, method = "parametric", proper = TRUE)
f2 <- glm.synds(smoke ~ sex + age + edu + ls, data = s2, family = "binomial")
summary(f2)
summary(f2, population.inference = TRUE)