multi.compare {synthpop} | R Documentation |
Multivariate comparison of synthesised and observed data
Description
Graphical comparisons of a variable (var) in the synthesised data set
with the original (observed) data set within subgroups defined by the
variables in a vector by. var can be a factor or a continuous
variable, and the plots produced depend on the class of var.
The variables in by will usually be factors or variables with only
a few values.
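The way the variables in by define subgroups can be sketched in base R. This is only an illustration of cross-tabulating grouping variables, not the code multi.compare runs internally; the toy data frame and its values are made up for the example.

```r
# Toy data; the values are illustrative and not taken from SD2011
toy <- data.frame(
  sex = factor(c("F", "M", "F", "M", "F", "M")),
  edu = factor(c("low", "low", "high", "high", "high", "low")),
  age = c(23, 35, 47, 52, 31, 64)
)

by <- c("sex", "edu")

# Cross-tabulate the grouping variables: each cell is one subgroup
counts <- table(toy[, by])
print(counts)

# Split the variable of interest into the same subgroups
groups <- split(toy$age, toy[, by], drop = TRUE)
print(groups)
```

With many by variables or many levels the number of cells grows quickly, which is presumably why by variables with only a few values are recommended.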
Usage
multi.compare(object, data, var = NULL, by = NULL, msel = NULL,
barplot.position = "fill", cont.type = "hist", y.hist = "count",
boxplot.point = TRUE, binwidth = NULL, ...)
Arguments
object |
an object of class |
data |
an original (observed) data set. |
var |
variable to be compared between observed and synthetic data within subgroups. |
by |
variables to be tabulated or cross-tabulated to form groups. |
barplot.position |
type of barplot. The default |
cont.type |
default |
y.hist |
defines y scale for histograms - |
boxplot.point |
default ( |
msel |
numbers of synthetic data sets to be used - must be numbers in
the range |
binwidth |
sets width of a bin for histograms. |
... |
additional parameters that can be supplied to |
Value
Plots as specified above. A table of the numbers in the subgroups is printed to the R console.
Numeric variables with fewer than 6 distinct values are changed to factors in order to make plots more readable.
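The conversion rule for numeric variables can be mimicked with a small helper. This is a minimal sketch of the stated rule only: few_values_to_factor is a hypothetical name, not a synthpop function, and ignoring missing values when counting distinct values is an assumption.

```r
# Hypothetical helper (not part of synthpop): turn numeric columns with
# fewer than `max_distinct` distinct values into factors.
few_values_to_factor <- function(df, max_distinct = 6) {
  for (nm in names(df)) {
    x <- df[[nm]]
    # Assumption: NA is not counted as a distinct value
    n_distinct <- length(unique(x[!is.na(x)]))
    if (is.numeric(x) && n_distinct < max_distinct) {
      df[[nm]] <- factor(x)
    }
  }
  df
}

# Toy data: nkids has 3 distinct values, age has 6
df <- data.frame(
  nkids = c(0, 1, 2, 1, 0, 2),
  age   = c(23, 35, 47, 52, 31, 64)
)
out <- few_values_to_factor(df)
str(out)
```

Here nkids becomes a factor while age, which has exactly 6 distinct values, stays numeric.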
See Also
compare.synds, compare.fit.synds
Examples
### default synthesis of selected variables
library(synthpop)  # provides syn(), multi.compare() and the SD2011 data
vars <- c("sex", "age", "edu", "smoke")
ods <- na.omit(SD2011[1:1000, vars])
s1 <- syn(ods)
### categorical var
multi.compare(s1, ods, var = "smoke", by = c("sex","edu"))
### numeric var
multi.compare(s1, ods, var = "age", by = c("sex"), y.hist = "density", binwidth = 5)
multi.compare(s1, ods, var = "age", by = c("sex", "edu"), cont.type = "boxplot")