DCARContControl-class {simFrame} | R Documentation
Class "DCARContControl"
Description
Class for controlling contamination in a simulation experiment. The values of the contaminated observations are distributed completely at random (DCAR), i.e., they do not depend on the original values.
Objects from the Class
Objects can be created by calls of the form new("DCARContControl", ...), DCARContControl(...) or ContControl(..., type="DCAR") (the latter exists mainly for backward compatibility with early draft versions of simFrame).
Slots
target: Object of class "OptCharacter"; a character vector specifying the variables (columns) to be contaminated, or NULL to contaminate all variables (except the additional ones generated internally).
epsilon: Object of class "numeric" giving the contamination levels.
grouping: Object of class "character" specifying a grouping variable (column) to be used for contaminating whole groups rather than individual observations (the same values are used for all observations in the same group).
aux: Object of class "character" specifying an auxiliary variable (column) whose values are used as probability weights for selecting the items (observations or groups) to be contaminated.
distribution: Object of class "function" generating the values of the contamination data, e.g., rnorm (the default) or rmvnorm from package mvtnorm. It should take a non-negative integer as its first argument, giving the number of items to be created, and return an object that can be coerced to a data.frame, containing the contamination data.
dots: Object of class "list" containing additional arguments to be passed to distribution.
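The requirements on slot distribution can be illustrated with a small base R sketch. The function rShiftedLognormal and its arguments are hypothetical and not part of simFrame; the do.call() line only approximates how a generating function and the contents of slot dots might be combined, and is not taken from the package source.

```r
# Hypothetical contamination distribution (an assumption for illustration,
# not a simFrame function). Its first argument is the number of items to
# generate, and it returns a data.frame.
rShiftedLognormal <- function(n, shift = 0, meanlog = 0, sdlog = 1) {
    data.frame(value = shift + rlnorm(n, meanlog = meanlog, sdlog = sdlog))
}

# Additional arguments of the kind that would be stored in slot 'dots'
dots <- list(shift = 1e+05, meanlog = 10, sdlog = 0.5)

# Assumed calling pattern: number of items first, then the extra arguments
set.seed(1)
values <- do.call(rShiftedLognormal, c(list(5), dots))
```

A function of this shape could then be supplied as the distribution argument, with the list supplied as dots.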
Extends
Class "ContControl", directly.
Class "VirtualContControl", by class "ContControl", distance 2.
Class "OptContControl", by class "ContControl", distance 3.
Details
With this control class, contamination is modeled as a two-step process. The
first step is to select the observations to be contaminated; the second is to
model the distribution of the outliers. In this case, the values of the
contaminated observations are generated by the function given by slot
distribution and do not depend on the original values.
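The two-step process can be mimicked in base R without simFrame. This is only a rough sketch under stated assumptions: the toy data, the rounding of the number of contaminated observations, and the use of sample() are illustrative choices, not the package's internal implementation.

```r
set.seed(42)

# Toy data standing in for a sample (values are made up for illustration)
x <- data.frame(id = 1:20,
                eqIncome = round(rlnorm(20, meanlog = 10, sdlog = 0.5)))
epsilon <- 0.05  # contamination level

# Step 1: select the observations to be contaminated
n <- nrow(x)
nCont <- round(epsilon * n)  # rounding rule is an assumption
ind <- sample(n, nCont)

# Step 2: generate the contaminated values independently of the originals
contaminated <- x
contaminated$eqIncome[ind] <- rnorm(nCont, mean = 5e+05, sd = 10000)
```

Selection with probability weights, as described for slot aux, would loosely correspond to passing a prob argument to sample() in the first step.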
Accessor and mutator methods
In addition to the accessor and mutator methods for the slots inherited from "ContControl", the following are available:
getDistribution signature(x = "DCARContControl"): get slot distribution.
setDistribution signature(x = "DCARContControl"): set slot distribution.
getDots signature(x = "DCARContControl"): get slot dots.
setDots signature(x = "DCARContControl"): set slot dots.
Methods
Methods are inherited from "ContControl".
UML class diagram
A slightly simplified UML class diagram of the framework can be found in
Figure 1 of the package vignette "An Object-Oriented Framework for
Statistical Simulation: The R Package simFrame". Use
vignette("simFrame-intro") to view this vignette.
Note
The slot grouping was named group prior to version 0.2.
Renaming the slot was necessary since accessor and mutator functions were
introduced in this version and a function named getGroup already exists.
Author(s)
Andreas Alfons
References
Alfons, A., Templ, M. and Filzmoser, P. (2010) An Object-Oriented Framework for Statistical Simulation: The R Package simFrame. Journal of Statistical Software, 37(3), 1–36. doi: 10.18637/jss.v037.i03.
Alfons, A., Templ, M. and Filzmoser, P. (2010) Contamination Models in the R Package simFrame for Statistical Simulation. In Aivazian, S., Filzmoser, P. and Kharin, Y. (editors) Computer Data Analysis and Modeling: Complex Stochastic Data and Systems, volume 2, 178–181. Minsk. ISBN 978-985-476-848-9.
Béguin, C. and Hulliger, B. (2008) The BACON-EEM Algorithm for Multivariate Outlier Detection in Incomplete Survey Data. Survey Methodology, 34(1), 91–103.
Hulliger, B. and Schoch, T. (2009) Robust Multivariate Imputation with Survey Data. 57th Session of the International Statistical Institute, Durban.
See Also
"DARContControl", "ContControl", "VirtualContControl", contaminate
Examples
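# load the example population data
data(eusilcP)
# draw a sample of 20 observations
sam <- draw(eusilcP[, c("id", "eqIncome")], size = 20)
# contaminate 5% of eqIncome with values from rnorm(mean = 5e+05, sd = 10000)
cc <- DCARContControl(target = "eqIncome", epsilon = 0.05,
    dots = list(mean = 5e+05, sd = 10000))
contaminate(sam, cc)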