R: Natural Outlier Factor (NOF) algorithm

NOF {DDoutlier}

R Documentation

Natural Outlier Factor (NOF) algorithm

Description

Calculates the Natural Outlier Factor (NOF), an outlier score for each observation, as proposed by Huang, J., Zhu, Q., Yang, L. & Feng, J. (2015).

Usage

NOF(dataset)

Arguments

dataset

The dataset whose observations are to be scored. A NOF score is returned for every observation.

Details

NOF computes the nearest-neighbor and reverse-nearest-neighbor sets of each observation using the natural neighborhood algorithm, so no neighborhood size has to be supplied by the user. The density of each observation is then compared with the density of its neighbors. The kNN computation uses a kd-tree, via the kNN() function from the 'dbscan' package.
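The natural neighbor search described above can be illustrated with a minimal sketch. This is not the DDoutlier source: the function name `natural_neighbor_sketch` is hypothetical, the stopping rule (stop once every observation has been picked as a neighbor, or once the number of never-picked observations stops changing) is an assumption based on the general description of the natural neighbor search, and brute-force distances from base R stand in for the kd-tree search of dbscan::kNN() to keep the example dependency-free.

```r
# Simplified natural neighbor search (illustrative only; not the package code).
natural_neighbor_sketch <- function(dataset) {
  X <- as.matrix(dataset)
  n <- nrow(X)

  # Brute-force pairwise Euclidean distances (assumption: replaces the kd-tree)
  D <- as.matrix(dist(X))
  diag(D) <- Inf  # an observation is not its own neighbor

  # Row i lists the other observations ordered from nearest to farthest
  neighbor_order <- t(apply(D, 1, order))

  nb <- integer(n)        # in-degree: how often each observation was picked
  prev_isolated <- -1L    # count of never-picked observations in previous round
  r <- 0L

  while (r < n - 1L) {
    r <- r + 1L
    picked <- neighbor_order[, r]              # r-th nearest neighbor of each row
    nb <- nb + tabulate(picked, nbins = n)     # update in-degrees
    isolated <- sum(nb == 0L)
    # Assumed stopping rule: nobody is isolated, or the count has stabilised
    if (isolated == 0L || isolated == prev_isolated) break
    prev_isolated <- isolated
  }

  list(nb = nb, max_nb = max(nb), r = r)
}

res <- natural_neighbor_sketch(iris[, 1:4])
```

In this sketch `r` is the round at which the search stops and `max(nb)` is the largest in-degree, mirroring the `r` and `max_nb` entries of the returned list. The exact values produced by NOF() may differ from this simplified version.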

Value

`nb`	A vector of in-degrees, one per observation: the number of observations that count it among their nearest neighbors
`max_nb`	The maximum in-degree found in the nb vector. Used as the k parameter in the outlier detection step of NOF
`r`	The natural neighbor eigenvalue
`NOF`	A vector of Natural Outlier Factor scores. The greater the NOF, the greater the outlierness

Author(s)

Jacob H. Madsen

References

Huang, J., Zhu, Q., Yang, L. & Feng, J. (2015). A non-parameter outlier detection algorithm based on Natural Neighbor. Knowledge-Based Systems. pp. 71-77. DOI: 10.1016/j.knosys.2015.10.014

Examples

# Load the package that provides NOF()
library(DDoutlier)

# Select dataset: the four numeric columns of iris
X <- iris[,1:4]

# Run NOF algorithm
outlier_score <- NOF(dataset=X)$NOF

# Label each score with its row index, then sort so that the most
# outlying observations come first
names(outlier_score) <- seq_len(nrow(X))
sort(outlier_score, decreasing = TRUE)

# Inspect the distribution of outlier scores
hist(outlier_score)
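As a follow-up to the sorting step, the indices of the highest-scoring observations can be pulled out directly. The helper below is a small sketch in base R; `top_outliers` is a hypothetical name and not part of DDoutlier, and the toy vector stands in for the scores returned by NOF(dataset=X)$NOF.

```r
# Hypothetical helper (not part of DDoutlier): row indices of the n highest scores
top_outliers <- function(scores, n = 5) {
  n <- min(n, length(scores))               # never ask for more than we have
  head(order(scores, decreasing = TRUE), n)
}

# Toy scores standing in for NOF(dataset = X)$NOF
toy_scores <- c(1.0, 1.1, 3.2, 0.9, 2.4)
top_idx <- top_outliers(toy_scores, n = 2)  # observations 3 and 5
```

With real NOF scores the same call would be `top_outliers(outlier_score, n = 5)`. How many observations to treat as outliers is a choice left to the analyst.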

[Package DDoutlier version 0.1.0 Index]