apd_isolation {applicable} | R Documentation |
apd_isolation()
fits an isolation forest model.
apd_isolation(x, ...)
## Default S3 method:
apd_isolation(x, ...)
## S3 method for class 'data.frame'
apd_isolation(x, ...)
## S3 method for class 'matrix'
apd_isolation(x, ...)
## S3 method for class 'formula'
apd_isolation(formula, data, ...)
## S3 method for class 'recipe'
apd_isolation(x, data, ...)
x |
Depending on the context:
|
... |
Options to pass to |
formula |
A formula specifying the predictor terms on the right-hand side. No outcome should be specified. |
data |
When a recipe or formula is used,
|
In an isolation forest, splits are designed to isolate individual data points. The tree construction process takes random split locations on randomly selected predictors. As splits are made in the tree, the algorithm tracks when data points are isolated as more splits are made. The first points that are isolated are thought to be outliers or anomalous. From these results, an anomaly score can be constructed.
This function creates an isolation forest on the training set and measures the reference distribution of the scores when re-predicting the training set. When scoring new data, the raw anomaly score is produced along with the sample's corresponding percentile of the reference distribution.
A apd_isolation
object.
Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation forest." 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008. Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation-based anomaly detection." ACM Transactions on Knowledge Discovery from Data (TKDD) 6.1 (2012): 3.
if (rlang::is_installed(c("isotree", "modeldata"))) {
library(dplyr)
data(cells, package = "modeldata")
cells_tr <- cells %>% filter(case == "Train") %>% select(-case, -class)
cells_te <- cells %>% filter(case != "Train") %>% select(-case, -class)
if_mod <- apd_isolation(cells_tr, ntrees = 10, nthreads = 1)
if_mod
}