getHDoutliers {HDoutliers} | R Documentation |
Outlier Detection Stage of Wilkinson's hdoutliers Algorithm
Description
Detects outliers based on a probability model.
Usage
getHDoutliers(data, memberLists, alpha = 0.05, transform = TRUE)
Arguments
data |
A vector, matrix, or data frame consisting of numeric and/or categorical variables. |
memberLists |
A list following the structure of the output of getHDmembers. |
alpha |
Threshold for determining the cutoff for outliers.
Observations are considered outliers if they fall in the
(1 - alpha) tail of the fitted CDF. |
transform |
A logical value indicating whether the data needs to be
transformed to conform to Wilkinson's specifications before outlier
detection. The default is to transform the data using the function
dataTrans. |
Details
An exponential distribution is fitted to the upper tail of the
nearest-neighbor distances between exemplars (the observations
considered representatives of each component of memberLists).
Observations are considered outliers if they fall in the
(1 - alpha) tail of the fitted CDF.
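The idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers nnDist and flagByExpTail are hypothetical names, every row is treated as its own exemplar (no memberLists), the "upper tail" is assumed to be the distances above the median, and the exponential rate is estimated by simple maximum likelihood. The package's actual fitting procedure may differ.

```r
# Hypothetical helper: nearest-neighbor distance of each row (base R only).
nnDist <- function(x) {
  d <- as.matrix(dist(x))   # full pairwise Euclidean distances
  diag(d) <- Inf            # ignore each point's distance to itself
  apply(d, 1, min)
}

# Hypothetical helper: flag rows whose nearest-neighbor distance lies beyond
# the (1 - alpha) quantile of an exponential fitted to the upper tail.
flagByExpTail <- function(x, alpha = 0.05) {
  d <- nnDist(x)
  u <- median(d)                   # assumption: upper tail = above the median
  excess <- d[d > u] - u           # distances in excess of the threshold
  rate <- 1 / mean(excess)         # maximum-likelihood estimate of the rate
  cutoff <- u + qexp(1 - alpha, rate = rate)
  unname(which(d > cutoff))        # row indexes of the flagged observations
}

set.seed(1)
pts <- rbind(matrix(runif(200), ncol = 2),  # 100 points in the unit square
             c(10, 10))                     # one planted outlier (row 101)
flagged <- flagByExpTail(pts)
print(flagged)
```

The planted point in row 101 is far from the unit square, so its nearest-neighbor distance falls well beyond the fitted cutoff and it is flagged. Whether any of the other rows are also flagged depends on the random draw and on the simplified fit used here.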
Value
The indexes of the observations determined to be outliers.
References
Wilkinson, L. (2016). Visualizing Outliers. <https://www.cs.uic.edu/~wilkinson/Publications/outliers.pdf>.
Note
A call to getHDoutliers in which memberLists results from
a call to getHDmembers is equivalent to calling HDoutliers.
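The equivalence stated in this note can be checked with a short sketch along the following lines. It assumes the HDoutliers package is installed and that default arguments are used throughout; the variable names two.stage, one.call and same are illustrative only. The comparison uses setequal so that it depends only on which observations are reported, not on their order.

```r
# Runs only if the HDoutliers package is available.
if (requireNamespace("HDoutliers", quietly = TRUE)) {
  library(HDoutliers)
  data(ex2D)

  # Two-stage route: find the members first, then detect the outliers.
  mem.ex2D  <- getHDmembers(ex2D)
  two.stage <- getHDoutliers(ex2D, mem.ex2D)

  # Single-call route.
  one.call <- HDoutliers(ex2D)

  # Compare the two sets of outlier indexes.
  same <- setequal(two.stage, one.call)
  print(same)
}
```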
See Also
HDoutliers, getHDmembers, dataTrans
Examples
data(dots)
mem.W <- getHDmembers(dots$W)
out.W <- getHDoutliers(dots$W,mem.W)
## Not run:
plotHDoutliers(dots$W, out.W)
## End(Not run)
data(ex2D)
mem.ex2D <- getHDmembers(ex2D)
out.ex2D <- getHDoutliers( ex2D, mem.ex2D)
## Not run:
plotHDoutliers( ex2D, out.ex2D)
## End(Not run)
## Not run:
n <- 100000 # number of observations
set.seed(3)
x <- matrix(rnorm(2*n),n,2)
nout <- 10 # number of outliers
x[sample(1:n,size=nout),] <- 10*runif(2*nout,min=-1,max=1)
mem.x <- getHDmembers(x)
out.x <- getHDoutliers(x, mem.x)
## End(Not run)