High dimensional MCD based detection of outliers {Rfast} | R Documentation |
High dimensional MCD based detection of outliers
Description
High dimensional MCD based detection of outliers.
Usage
rmdp(y, alpha = 0.05, itertime = 100, parallel = FALSE)
Arguments
y |
A matrix with numerical data with more columns (p) than rows (n), i.e. n<p. |
alpha |
The significance level used to decide whether an observation is flagged as a possible outlier. The default value is 0.05. |
itertime |
The number of iterations the algorithm will be run. The larger the sample size, the larger this number must be.
With 50 observations in |
parallel |
A logical value; if TRUE, the parallel version is used. |
Details
High dimensional outliers (n<<p) are detected using a properly constructed MCD. Only the variances of the variables are used, so the covariance matrix is treated as diagonal and its determinant is simply the product of the variances.
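The diagonal simplification described above can be illustrated with a short base R sketch. This is not the code used inside rmdp; the names n, p, y, v, D, m and d2 are assumptions made for illustration only. It shows that the determinant of a diagonal matrix of variances equals their product (computed on the log scale to avoid overflow or underflow) and that the corresponding Mahalanobis type distance reduces to a sum of squared, variance-scaled deviations.

```r
set.seed(1)
n <- 10                                  # number of observations (illustrative)
p <- 30                                  # number of variables, n < p
y <- matrix(rnorm(n * p), nrow = n)

v <- apply(y, 2, var)                    # per-variable sample variances
D <- diag(v)                             # diagonal "covariance" matrix

# log-determinant computed directly and as the sum of log-variances
log_det_direct <- as.numeric(determinant(D, logarithm = TRUE)$modulus)
log_det_prod <- sum(log(v))

# squared distances under the diagonal matrix: no matrix inversion needed
m <- colMeans(y)
d2 <- rowSums(sweep(sweep(y, 2, m, "-")^2, 2, v, "/"))
```

Because only p variances are needed, this form remains usable when n < p, where the full sample covariance matrix is singular.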
Value
A list including:
runtime |
The duration of the process. |
dis |
The final estimated Mahalanobis type normalised distances. |
wei |
A boolean vector specifying whether an observation is "clean" (TRUE) or a possible outlier (FALSE). |
cova |
The estimated covariance matrix. |
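As a minimal sketch of how the returned components might be used, the snippet below extracts the indices of the flagged observations from wei. The object res is a hypothetical stand-in with made-up values, not actual output of rmdp; only the element names dis and wei are taken from the list above.

```r
# hypothetical stand-in for the list returned by rmdp (values are made up)
res <- list(dis = c(0.3, -0.1, 4.2, 0.5),
            wei = c(TRUE, TRUE, FALSE, TRUE))

outliers <- which(!res$wei)      # indices of the possible outliers
clean <- which(res$wei)          # indices of the "clean" observations
res$dis[outliers]                # distances of the flagged observations
```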
Author(s)
Initial R code: Changliang Zou <nk.chlzou@gmail.com>. R code modifications: Michail Tsagris <mtsagris@uoc.gr>. C++ implementation: Manos Papadakis <papadakm95@gmail.com>. Documentation: Michail Tsagris <mtsagris@uoc.gr> and Changliang Zou <nk.chlzou@gmail.com>.
References
Ro K., Zou C., Wang Z. and Yin G. (2015). Outlier detection for high-dimensional data. Biometrika, 102(3):589-599.
See Also
Examples
# 50 observations on 400 variables, so n < p
x <- matrix(rnorm(50 * 400), ncol = 400)
a <- rmdp(x, itertime = 500)
# remove the objects created by the example
x <- a <- NULL