r6pack {robustbase} | R Documentation |
Robust Distance based observation orderings based on robust "Six pack"
Description
Compute six initial robust estimators of multivariate location and
“scatter” (scale); then, for each, compute the distances d_{ij}
and take the h (h > n/2) observations with smallest distances.
Then compute the statistical distances based on these h observations.
Return the indices of the observations sorted in increasing order
of these distances.
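The distance-and-ordering step described above can be sketched in base R. This is only an illustration and not the code used inside r6pack(): the simulated data, the names loc0, cov0 and hset, and the choice of a median/MAD starting estimate are all assumptions made for the example.

```r
set.seed(1)
n <- 50; p <- 3
x <- matrix(rnorm(n * p), n, p)
x[1:5, ] <- x[1:5, ] + 10          # five planted outliers (illustrative data)
h <- 30                            # h > n/2

# Hypothetical initial estimate: columnwise median, diagonal scatter from the MAD
loc0 <- apply(x, 2, median)
cov0 <- diag(apply(x, 2, mad)^2)

# Step 1: squared distances from the initial estimate; keep the h closest rows
d0 <- mahalanobis(x, center = loc0, cov = cov0)
hset <- order(d0)[1:h]

# Step 2: classical mean and covariance of the h-subset,
#         then squared statistical distances for all n observations
loc1 <- colMeans(x[hset, , drop = FALSE])
cov1 <- cov(x[hset, , drop = FALSE])
d1 <- mahalanobis(x, center = loc1, cov = cov1)

# Step 3: observation indices sorted by increasing distance
ord <- order(d1)
```

In this sketch, ord plays the role of one column of the result with full.h = TRUE, and ord[1:h] the role of one column with full.h = FALSE.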
Usage
r6pack(x, h, full.h, scaled = TRUE, scalefn = rrcov.control()$scalefn)
Arguments
x |
n x p data matrix |
h |
integer, typically around (and slightly larger than) n/2 |
full.h |
logical specifying if the full (length n) observation
ordering should be returned; otherwise only the first h indices are |
scaled |
logical indicating if the data x is already scaled |
scalefn |
a function to compute a robust univariate scale estimate |
Details
The six initial estimators are
Hyperbolic tangent of standardized data
Spearman correlation matrix
Tukey normal scores
Spatial sign covariance matrix
BACON
Raw OGK estimate for scatter
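As a rough illustration of the first four estimators in this list, the following base R sketch builds the corresponding correlation or scatter matrices from robustly standardized data. It is not the package's internal code: the simulated data, the median/MAD standardization, the rank-score constants and the names z, R1, R2, R3 and S4 are assumptions for the example, and BACON and the raw OGK estimate are left out because they need more machinery.

```r
set.seed(2)
n <- 40; p <- 3
x <- matrix(rnorm(n * p), n, p)

# Robust standardization (assumption: columnwise median and MAD)
z <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

# 1. Correlation of the hyperbolic tangent of the standardized data
R1 <- cor(tanh(z))

# 2. Spearman (rank) correlation matrix
R2 <- cor(z, method = "spearman")

# 3. Correlation of normal scores computed from the columnwise ranks
scores <- qnorm((apply(z, 2, rank) - 1/3) / (n + 1/3))
R3 <- cor(scores)

# 4. Spatial sign covariance: scale each row to unit length first
nrm <- sqrt(rowSums(z^2))
u <- z / nrm                       # row i is divided by nrm[i]
S4 <- crossprod(u) / n
```

Each of these matrices bounds the influence of extreme observations in a different way (squashing, ranks, normal scores, projection onto the unit sphere), which is why they serve as cheap robust starting points.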
Value
an h' \times 6 matrix of observation indices, i.e., with values
from 1,\dots,n. If full.h is true, h' = n, otherwise h' = h.
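A brief sketch of how the returned index matrix might be used, assuming package robustbase and its pulpfiber data are available. Treating each column as a candidate h-subset and comparing log-determinants is only an illustration; the names idx, subset_stats and logdets are hypothetical.

```r
library(robustbase)
data(pulpfiber)
x <- data.matrix(pulpfiber)
n <- nrow(x); h <- 40

idx <- r6pack(x, h = h, full.h = FALSE)   # h x 6 matrix of row indices

# Each column is one candidate h-subset: compute its mean and covariance
subset_stats <- lapply(seq_len(ncol(idx)), function(k) {
  xs <- x[idx[, k], , drop = FALSE]
  list(center = colMeans(xs), cov = cov(xs))
})

# Log-determinants of the six subset covariance matrices, for comparison
logdets <- vapply(subset_stats, function(s)
  as.numeric(determinant(s$cov, logarithm = TRUE)$modulus), numeric(1))
```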
Author(s)
Valentin Todorov, based on the original Matlab code by
Tim Verdonck and Mia Hubert. Martin Maechler for tweaks
(performance etc.), and full.h.
References
Hubert, M., Rousseeuw, P. J. and Verdonck, T. (2012) A deterministic algorithm for robust location and scatter. Journal of Computational and Graphical Statistics 21, 618–637.
See Also
covMcd(*, nsamp = "deterministic");
CovSest(*, nsamp = "sdet") from package rrcov.
Examples
data(pulpfiber)
dim(m.pulp <- data.matrix(pulpfiber)) # 62 x 8
dim(fr6 <- r6pack(m.pulp, h = 40, full.h= FALSE)) # h x 6 = 40 x 6
dim(fr6F <- r6pack(m.pulp, h = 40, full.h= TRUE )) # n x 6 = 62 x 6
stopifnot(identical(fr6, fr6F[1:40,]))