OutlierSign1 {rrcovHD}    R Documentation

Outlier identification in high dimensions using the SIGN1 algorithm

Description

Fast algorithm for identifying multivariate outliers in high-dimensional and/or large datasets using spatial signs; see Filzmoser, Maronna, and Werner (CSDA, 2008). Outlyingness is measured by Mahalanobis distances.

Usage

OutlierSign1(x, ...)
## Default S3 method:
OutlierSign1(x, grouping, qcrit = 0.975, trace = FALSE, ...)
## S3 method for class 'formula'
OutlierSign1(formula, data, ..., subset, na.action)

Arguments

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see model.frame) containing the variables in the formula.

subset

an optional vector used to select rows (observations) of the data matrix x.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit.

...

arguments passed to or from other methods.

x

a matrix or data frame.

grouping

grouping variable: a factor specifying the class for each observation.

qcrit

a numeric value between 0 and 1 indicating the quantile to be used as the critical value for outlier detection (defaults to 0.975).

trace

whether to print intermediate results. Default is trace = FALSE.

Details

Based on the robustly sphered and normed data, robust principal components are computed. These are used for computing the covariance matrix which is the basis for Mahalanobis distances. A critical value from the chi-square distribution is then used as outlier cutoff.
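The steps described above can be illustrated with a short base R sketch. This is a simplified illustration and not the rrcovHD source: the function name sign_outliers, the use of the median and MAD for sphering, the squared MAD as the robust variance of each component, and the choice to keep all components are assumptions made here for the example. The flag convention (1 = within the cutoff, 0 = flagged) is likewise chosen for this sketch only.

```r
# Simplified sketch of a sign-based outlier rule (hypothetical helper,
# not the implementation used by OutlierSign1).
sign_outliers <- function(x, qcrit = 0.975) {
  x <- as.matrix(x)

  # 1. Robust sphering: centre each variable by its median, scale by its MAD.
  x_mad <- apply(x, 2, mad)
  if (any(x_mad == 0)) stop("a variable has zero MAD and cannot be scaled")
  x_sc <- scale(x, center = apply(x, 2, median), scale = x_mad)

  # 2. Norming: divide each row by its Euclidean norm (spatial signs).
  norms <- sqrt(rowSums(x_sc^2))
  norms[norms == 0] <- 1            # leave all-zero rows unchanged
  x_sign <- x_sc / norms

  # 3. Principal directions of the spatial signs.
  evec <- svd(x_sign)$v

  # 4. Scores of the sphered data on those directions, with a robust
  #    variance (squared MAD) for each component.
  scores <- x_sc %*% evec
  pc_var <- apply(scores, 2, mad)^2
  if (any(pc_var == 0)) stop("a component has zero robust variance")

  # 5. Mahalanobis-type distances and chi-square cutoff.
  d <- sqrt(rowSums(sweep(scores^2, 2, pc_var, "/")))
  cutoff <- sqrt(qchisq(qcrit, df = ncol(scores)))

  list(distance = d, cutoff = cutoff, flag = as.integer(d <= cutoff))
}

# Usage on simulated data with five shifted observations.
set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
x[1:5, ] <- x[1:5, ] + 10
res <- sign_outliers(x)
which(res$flag == 0)                # indices of flagged observations
```

Because the principal directions are estimated from the spatial signs rather than from the raw data, far-away points influence them only through their direction, which is what keeps the resulting distances from being distorted by the outliers themselves.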

Value

An S4 object of class OutlierSign1 which is a subclass of the virtual class Outlier.

Author(s)

Valentin Todorov valentin.todorov@chello.at

References

Filzmoser P, Maronna R & Werner M (2008). Outlier identification in high dimensions, Computational Statistics & Data Analysis 52, 1694–1711.

Filzmoser P & Todorov V (2013). Robust tools for the imperfect world, Information Sciences 245, 4–20. doi:10.1016/j.ins.2012.10.017.

See Also

OutlierSign1, OutlierSign2, Outlier

Examples


library(rrcovHD)
data(hemophilia)
obj <- OutlierSign1(gr ~ ., data = hemophilia)
obj

getDistance(obj)            # returns an array of distances
getClassLabels(obj, 1)      # returns an array of indices for a given class
getCutoff(obj)              # returns an array of cutoff values (one per class, usually equal)
getFlag(obj)                # returns a 0/1 array of flags
plot(obj, class = 2)        # standard plot function

[Package rrcovHD version 0.3-0 Index]