NOF {DDoutlier} | R Documentation |
Natural Outlier Factor (NOF) algorithm
Description
Function to calculate the Natural Outlier Factor (NOF) as an outlier score for observations. Suggested by Huang, J., Zhu, Q., Yang, L. & Feng, J. (2015)
Usage
NOF(dataset)
Arguments
dataset |
The dataset for which observations have a NOF score returned |
Details
NOF computes the nearest and reverse nearest neighborhood for observations, based on the natural neighborhood algorithm. Density is compared between observations and their neighbors. A kd-tree is used for kNN computation, using the kNN() function from the 'dbscan' package
Value
nb |
A vector of in-degrees for observations |
max_nb |
Maximum in-degree observations in nb vector. Used as k-parameter in outlier detection of NOF |
r |
The natural neighbor eigenvalue |
NOF |
A vector of Natural Outlier Factor scores. The greater the NOF, the greater the outlierness |
Author(s)
Jacob H. Madsen
References
Huang, J., Zhu, Q., Yang, L. & Feng, J. (2015). A non-parameter outlier detection algorithm based on Natural Neighbor. Knowledge-Based Systems. pp. 71-77. DOI: 10.1016/j.knosys.2015.10.014
Examples
# Select dataset
X <- iris[,1:4]
# Run NOF algorithm
outlier_score <- NOF(dataset=X)$NOF
# Sort and find index for most outlying observations
names(outlier_score) <- 1:nrow(X)
sort(outlier_score, decreasing = TRUE)
# Inspect the distribution of outlier scores
hist(outlier_score)