LOF {bigutilsr}R Documentation

Local Outlier Factor (LOF)

Description

LOF: Identifying Density-Based Local Outliers.

Usage

LOF(
  U,
  seq_k = c(4, 10, 30),
  combine = max,
  robMaha = FALSE,
  log = TRUE,
  ncores = 1
)

Arguments

U

A matrix, from which to detect outliers (rows). E.g. PC scores.

seq_k

Sequence of numbers of nearest neighbors to use. If multiple k are provided, this returns the combination of statistics. Default is c(4, 10, 30) and use max to combine (see combine).

combine

How to combine results for multiple k? Default uses max.

robMaha

Whether to use a robust Mahalanobis distance instead of the normal euclidean distance? Default is FALSE, meaning using euclidean.

log

Whether to return the logarithm of LOFs? Default is TRUE.

ncores

Number of cores to use. Default is 1.

References

Breunig, Markus M., et al. "LOF: identifying density-based local outliers." ACM sigmod record. Vol. 29. No. 2. ACM, 2000.

See Also

prob_dist()

Examples

X <- readRDS(system.file("testdata", "three-pops.rds", package = "bigutilsr"))
svd <- svds(scale(X), k = 10)

llof <- LOF(svd$u)
hist(llof, breaks = nclass.scottRob)
tukey_mc_up(llof)

llof_maha <- LOF(svd$u, robMaha = TRUE)
hist(llof_maha, breaks = nclass.scottRob)
tukey_mc_up(llof_maha)

lof <- LOF(svd$u, log = FALSE)
hist(lof, breaks = nclass.scottRob)
str(hist_out(lof))
str(hist_out(lof, nboot = 100))
str(hist_out(lof, nboot = 100, breaks = "FD"))


[Package bigutilsr version 0.3.4 Index]