| r6pack {robustbase} | R Documentation |
Robust distance-based observation orderings from the robust "six pack"
Description
Compute six initial robust estimators of multivariate location and
“scatter” (scale); then, for each, compute the distances
d_{ij} and take the h (h > n/2) observations
with smallest distances. Then compute the statistical distances based
on these h observations.
Return, for each estimator, the indices of the observations sorted by increasing distance.
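The procedure described above can be sketched in base R for a single initial estimator. This is a minimal illustration only, not the code used by r6pack(): the simulated data, the choice of the Spearman correlation as the initial scatter, the coordinatewise median as the initial location, and all variable names are assumptions made for the example, and the real function may refine the initial estimates in ways not shown here.

```r
# Illustrative sketch only (base R, one initial estimator); not r6pack() itself.
set.seed(1)
n <- 50; p <- 3
x <- matrix(rnorm(n * p), n, p)
x[1:5, ] <- x[1:5, ] + 10          # plant five obvious outliers
h <- 30                            # h > n/2

# Robustly standardize each column (median and MAD)
z <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

# Initial estimates (assumed here): coordinatewise median, Spearman correlation
loc0 <- apply(z, 2, median)
S0   <- cor(z, method = "spearman")

# Distances from the initial estimate; keep the h closest observations
d0   <- mahalanobis(z, loc0, S0)
keep <- order(d0)[1:h]

# Re-estimate from the h-subset, then order all n observations by distance
d1  <- mahalanobis(z, colMeans(z[keep, ]), cov(z[keep, ]))
ord <- order(d1)

head(ord, h)   # analogue of one result column when full.h = FALSE
```

In this sketch, `ord` plays the role of one column of the returned matrix with full.h = TRUE, and its first h entries the role of the same column with full.h = FALSE.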
Usage
r6pack(x, h, full.h, scaled = TRUE, scalefn = rrcov.control()$scalefn)
Arguments
x |
n x p data matrix |
h |
integer, typically around (and slightly larger than) n/2. |
full.h |
logical specifying if the full (length n) observation
ordering should be returned; otherwise only the first h are. |
scaled |
logical indicating if the data x is already scaled. |
scalefn |
a function to compute a robust univariate scale. |
Details
The six initial estimators are
Hyperbolic tangent of standardized data
Spearman correlation matrix
Tukey normal scores
Spatial sign covariance matrix
BACON
Raw OGK estimate for scatter
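To give a feel for the first four estimators in the list, here is a hedged base R sketch of the corresponding scatter matrices on robustly standardized data. It is not taken from the r6pack() source: the simulated data, the variable names, and in particular the rank-to-quantile convention used for the normal scores are assumptions. BACON and the raw OGK estimate are omitted because they need more than a few lines.

```r
# Illustrative sketch only; not the r6pack() implementation.
set.seed(2)
n <- 40; p <- 3
x <- matrix(rnorm(n * p), n, p)

# Robust column-wise standardization (median and MAD)
z <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

# 1. Hyperbolic tangent of the standardized data, then ordinary correlation
S_tanh <- cor(tanh(z))

# 2. Spearman (rank) correlation matrix
S_spearman <- cor(z, method = "spearman")

# 3. Normal scores computed from column-wise ranks
#    (the (rank - 1/3) / (n + 1/3) convention is an assumption of this sketch)
ranks <- apply(z, 2, rank)
S_nscores <- cor(qnorm((ranks - 1/3) / (n + 1/3)))

# 4. Spatial sign covariance: project each row onto the unit sphere
zs   <- z / sqrt(rowSums(z^2))
S_ss <- crossprod(zs) / n
```

Each result is a p x p symmetric matrix. The first three have unit diagonal because they are correlation matrices, while the spatial sign matrix has trace 1 because every projected row has unit length.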
Value
an h' x 6 matrix of observation
indices, i.e., with values from 1, ..., n. If
full.h is true, h' = n; otherwise h' = h.
Author(s)
Valentin Todorov, based on the original Matlab code by
Tim Verdonck and Mia Hubert. Martin Maechler for tweaks
(performance etc), and full.h.
References
Hubert, M., Rousseeuw, P. J. and Verdonck, T. (2012) A deterministic algorithm for robust location and scatter. Journal of Computational and Graphical Statistics 21, 618–637.
See Also
covMcd(*, nsamp = "deterministic");
CovSest(*, nsamp = "sdet") from package rrcov.
Examples
data(pulpfiber)
dim(m.pulp <- data.matrix(pulpfiber)) # 62 x 8
dim(fr6 <- r6pack(m.pulp, h = 40, full.h= FALSE)) # h x 6 = 40 x 6
dim(fr6F <- r6pack(m.pulp, h = 40, full.h= TRUE )) # n x 6 = 62 x 6
stopifnot(identical(fr6, fr6F[1:40,]))
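As a further, hedged illustration of the returned value, the snippet below checks that each column of the full ordering is a permutation of 1, ..., n and looks at which observations appear among the first ten of all six orderings. It assumes that the robustbase package is installed; the cutoff of ten and the helper name `common10` are arbitrary choices made for this example.

```r
library(robustbase)
data(pulpfiber)
m.pulp <- data.matrix(pulpfiber)
n <- nrow(m.pulp)

fr6F <- r6pack(m.pulp, h = 40, full.h = TRUE)

# With full.h = TRUE, every column should contain each index 1..n exactly once
stopifnot(apply(fr6F, 2, function(j) all(sort(j) == seq_len(n))))

# Observations ranked among the first ten by all six orderings (may be empty)
common10 <- Reduce(intersect,
                   lapply(seq_len(ncol(fr6F)), function(k) fr6F[1:10, k]))
common10
```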