which.outlier {NCmisc} | R Documentation |
Return vector indexes of statistical univariate outliers
Description
Performs simple univariate outlier detection and returns the indexes of the outliers. Acts like the which() function, returning the indices of the elements of a vector that satisfy a condition; by default these are values more than 2 SD above or below the mean. The threshold can be changed, detection can be restricted to only high or only low values, and percentile or interquartile range thresholds can be used instead of standard deviations.
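The default 2 SD rule described above can be sketched in base R. This is an illustration only, not the NCmisc source: `sd_outliers` is a hypothetical helper, and its handling of NA values (ignored when computing the mean and SD, never reported as outliers) is an assumption.

```r
# Hypothetical helper, not the NCmisc implementation: flags values lying
# more than `thr` standard deviations above and/or below the mean.
sd_outliers <- function(x, thr = 2, high = TRUE, low = TRUE) {
  x <- as.numeric(x)
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  is_high <- high & (x > m + thr * s)  # upper tail, only if requested
  is_low  <- low  & (x < m - thr * s)  # lower tail, only if requested
  which(is_high | is_low)              # which() treats NA as FALSE
}

x <- c(rep(10, 9), 100)
sd_outliers(x)                # 10: only the last element exceeds mean + 2 SD
sd_outliers(x, high = FALSE)  # integer(0): there are no low outliers
```

As with which(), the result is a vector of positions, so `x[sd_outliers(x)]` recovers the outlying values themselves.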
Usage
which.outlier(
x,
thr = 2,
method = c("sd", "iq", "pc"),
high = TRUE,
low = TRUE
)
Arguments
x |
numeric, or coercible, the vector to test for outliers |
thr |
numeric, threshold for the cutoff: when method="sd", the number of standard deviations; when "iq", the number of interquartile ranges (thr=1.5 is most typical here); when "pc", a percentage, e.g., to select the extreme 1%, 5%, etc. |
method |
character, one of "sd","iq" or "pc", selecting whether to test for outliers by standard deviation, interquartile range, or percentile. |
high |
logical, whether to test for outliers above the mean/median |
low |
logical, whether to test for outliers below the mean/median |
Value
Indexes of the elements of x that are outliers according to an SD cutoff, an interquartile range cutoff, or a percentile threshold, above (high) and/or below (low) the mean/median.
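For comparison, the interquartile range and percentile styles of threshold might look like the following in base R. Both helpers are hypothetical, and the cutoffs are assumptions: the "iq" sketch uses the usual box-plot fences (quartiles plus or minus thr times the IQR), and the "pc" sketch treats thr as the percentage of each tail to flag. The exact cutoffs used by which.outlier() may differ.

```r
# Hypothetical helpers, not the NCmisc implementation.

# Box-plot style fences: more than `thr` IQRs beyond the quartiles.
iqr_outliers <- function(x, thr = 1.5, high = TRUE, low = TRUE) {
  qs <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- qs[2] - qs[1]
  which((high & (x > qs[2] + thr * spread)) |
        (low  & (x < qs[1] - thr * spread)))
}

# Percentile cutoffs: values beyond the extreme `thr` percent of each tail.
pc_outliers <- function(x, thr = 5, high = TRUE, low = TRUE) {
  p <- thr / 100
  cuts <- quantile(x, c(p, 1 - p), na.rm = TRUE, names = FALSE)
  which((high & (x > cuts[2])) | (low & (x < cuts[1])))
}

x <- c(1:19, 100)
iqr_outliers(x)                 # 20: only the value 100 is beyond the upper fence
pc_outliers(x, 5, low = FALSE)  # 20: upper tail only
pc_outliers(x, 5)               # 1 20: both tails
```

Note that a percentile rule flags values in any non-degenerate sample, whereas the SD and IQR rules can return an empty result when no value is extreme enough.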
Examples
test.vec <- rnorm(200)
summary(test.vec)
ii <- which.outlier(test.vec) # 2 SD outliers
prv(ii); vals <- test.vec[ii]; prv(vals)
ii <- which.outlier(test.vec,1.5,"iq") # e.g., 'stars' on a box-plot
prv(ii)
ii <- which.outlier(test.vec,5,"pc",low=FALSE) # only outliers >mean
prv(ii)