simulatemissings {compositions}  R Documentation 
Artificial simulation of various kinds of missings/polluted data
Description
These are simulation mechanisms to check that missing-value techniques perform in sensible ways. They generate additional missings of the various types in a given dataset, according to a specific process.
Usage
simulateMissings(x, dl=NULL, knownlimit=FALSE,
MARprob=0.0, MNARprob=0.0, mnarity=0.5, SZprob=0.0)
observeWithAdditiveError(x, sigma=dl/dlf, dl=sigma*dlf, dlf=3,
keepObs=FALSE, digits=NA, obsScale=1,
class="acomp")
Arguments
x 
a dataset that should get the missings 
dl 
the detection limit described in

knownlimit 
a boolean indicating whether the actual detection limit is still known in the dataset. 
MARprob 
the probability of occurrence of 'Missings At Random' values 
MNARprob 
the probability of occurrence of 'Missings Not At Random' values. Small values tend to have a higher probability of being missed. 
mnarity 
a number between 0 and 1 giving the strength of the influence of the actual value on becoming a MNAR. 0 means MAR-like behavior and 1 means that only the smallest values are lost 
SZprob 
the probability of obtaining a structural zero. This is done at random, like a MAR. 
sigma 
the standard deviation of the normally distributed extra additive error 
dlf 
the distance from 0, in multiples of sigma, below which a datum will be considered below detection limit (BDL) 
keepObs 
should the (closed) data without additive error be returned as an attribute? 
digits 
rounding to be applied to the data with additive error (see Details) 
obsScale 
observation scale at which the rounding is applied to the data with additive error (see Details). Should be a power of 10. 
class 
class of the output object 
Details
Without any additional parameters no missings are generated. The procedure to generate MNAR affects all variables.
Function "simulateMissings" is a multipurpose simulator, where each class of missing value is treated separately, and where detection limits are specified as thresholds.
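The idea of treating each class of missing value separately can be sketched in base R. This is only an illustration under stated assumptions, not the code used by the package: the toy matrix, the independent per-cell draws, the order in which the classes overwrite each other, and the text labels in "status" are all hypothetical choices made for this sketch. MNAR is left out because its dependence on the actual value is not specified here.

```r
set.seed(1)

# Hypothetical toy data: 10 observations of 4 positive parts
x <- matrix(runif(40, min = 0.01, max = 1), ncol = 4)

dl      <- 0.05  # detection limit, used as a plain threshold
MARprob <- 0.10  # probability that a cell becomes MAR
SZprob  <- 0.05  # probability that a cell becomes a structural zero

# Label matrix (an assumption of this sketch, not the package's encoding)
status <- matrix("observed", nrow(x), ncol(x))

# Below detection limit: everything under the threshold
status[x < dl] <- "BDL"

# MAR and SZ: independent random draws per cell, ignoring the value
isMAR <- matrix(runif(length(x)) < MARprob, nrow = nrow(x))
status[isMAR] <- "MAR"
isSZ <- matrix(runif(length(x)) < SZprob, nrow = nrow(x))
status[isSZ] <- "SZ"

table(status)
```

Keeping the labels in a separate matrix makes it easy to compare the simulated missingness pattern with what a missing-value technique later recovers.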
Function "observeWithAdditiveError" simulates data within a very specific framework, where an additive error of sd=sigma is added to the input data x, and BDLs are generated if a datum is less than dlf times sigma. Afterwards, the resulting data are rounded as round(data/obsScale,digits)*obsScale, i.e. a certain observation scale obsScale is chosen, and at that scale, only some digits are kept.
This framework is typical of chemical analyses, and it generates both BDLs and
pollution/rounding of (apparently) "right" data.
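The three steps of this framework (additive error, detection limit at dlf times sigma, rounding at the observation scale) can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the input values are hypothetical, the closure of the data is ignored, and NA is used only as a placeholder for BDL values rather than the encoding the package actually uses.

```r
set.seed(42)

# Hypothetical "true" values, before measurement
x <- c(0.002, 0.01, 0.05, 0.2, 0.738)

sigma    <- 0.005        # sd of the additive error
dlf      <- 3            # detection limit factor
dl       <- sigma * dlf  # resulting detection limit
obsScale <- 0.01         # observation scale (a power of 10)
digits   <- 1            # digits kept at that scale

# Step 1: add normally distributed error
noisy <- x + rnorm(length(x), mean = 0, sd = sigma)

# Step 2: flag values that fall below the detection limit
isBDL <- noisy < dl

# Step 3: round at the chosen observation scale
observed <- round(noisy / obsScale, digits) * obsScale
observed[isBDL] <- NA  # placeholder marker for BDL (assumption of this sketch)

data.frame(true = x, noisy = noisy, observed = observed, BDL = isBDL)
```

With obsScale = 0.01 and digits = 1, values are kept to one decimal place of a hundredth, that is, to a resolution of 0.001.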
Value
A dataset like x but with some additional missings.
Author(s)
K. Gerald van den Boogaart
References
van den Boogaart, K., R. Tolosana-Delgado, and M. Bren (2011). The Compositional Meaning of a Detection Limit. In Proceedings of the 4th International Workshop on Compositional Data Analysis.
van den Boogaart, K.G., R. Tolosana-Delgado, and M. Templ (2014). Regression with compositional response having unobserved components or below detection limit values. Statistical Modelling (in press).
See compositions.missings for more details.
See Also
Examples
data(SimulatedAmounts)
x <- acomp(sa.lognormals)
xnew <- simulateMissings(x, dl=0.05, MARprob=0.05, MNARprob=0.05, SZprob=0.05)
acomp(xnew)
plot(missingSummary(xnew))