Outliers {moreparty} | R Documentation |
Computes outliers
Description
Computes outlierness scores and detects outliers.
Usage
Outliers(prox, cls=NULL, data=NULL, threshold=10)
Arguments
prox |
A proximity matrix (a square matrix with 1 on the diagonal and values between 0 and 1 off the diagonal). |
cls |
Factor. The classes the rows in the proximity matrix belong to. If NULL (default), all data are assumed to come from the same class. |
data |
A data frame of variables to describe the outliers (optional). |
threshold |
Numeric. The value of outlierness above which an observation is considered an outlier. Default is 10. |
Details
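The outlierness score of a case is computed as n divided by the sum of its squared proximities, where n is the number of cases. Within each class, the scores are then normalized by subtracting the median and dividing by the MAD (median absolute deviation).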
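The computation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: the helper name outlierness_sketch and the toy proximity matrix are hypothetical, the sum of squared proximities is assumed to run over the cases of the same class (diagonal included), and the MAD is assumed to be the one returned by stats::mad with its default settings.

```r
# Hypothetical helper illustrating the outlierness formula; not the package's code.
outlierness_sketch <- function(prox, cls = NULL) {
  stopifnot(is.matrix(prox), nrow(prox) == ncol(prox))
  n <- nrow(prox)
  # With no classes given, treat all cases as a single class.
  if (is.null(cls)) cls <- rep(1, n)
  cls <- factor(cls)
  scores <- rep(NA_real_, n)
  for (lv in levels(cls)) {
    in_class <- cls == lv
    # Raw score: n / sum of squared proximities to cases of the same class (assumption).
    ssq <- rowSums(prox[in_class, in_class, drop = FALSE]^2)
    raw <- n / ssq
    # Normalize within the class: subtract the median, divide by the MAD.
    scores[in_class] <- (raw - stats::median(raw)) / stats::mad(raw)
  }
  scores
}

# Toy proximity matrix built from positions on a line: case 6 is far from the rest.
x <- c(0, 0.1, 0.25, 0.4, 0.6, 3)
prox_toy <- exp(-abs(outer(x, x, "-")))  # 1 on the diagonal, values in (0, 1] elsewhere
scores_toy <- outlierness_sketch(prox_toy)
which(scores_toy > 10)  # cases above the default threshold
```

In this toy matrix only the isolated case has low proximities to all others, so its raw score is large compared with the median and it is the one that exceeds the threshold.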
Value
A list with the following elements:
scores |
numeric vector containing the outlierness scores |
outliers |
numeric vector of indexes of the outliers or, when data is supplied, a data frame with the outliers and their characteristics |
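The relationship between the two elements can be sketched as follows. The score vector and the descriptive data frame are made up for illustration, and the layout of the resulting data frame is an assumption; the columns actually returned by Outliers() may differ.

```r
# Hypothetical scores and descriptive data, for illustration only.
scores <- c(0.3, -0.8, 12.4, 1.1, 25.0)
desc <- data.frame(x = c(5.1, 4.9, 7.7, 5.0, 2.0),
                   y = c("a", "a", "b", "a", "b"))
threshold <- 10

# Indexes of the cases whose outlierness score is above the threshold.
idx <- which(scores > threshold)

# With descriptive data, the flagged rows can be listed next to their scores
# (assumed layout, not necessarily the one produced by Outliers()).
flagged <- data.frame(desc[idx, , drop = FALSE], score = scores[idx])
```

Here idx holds the positions of the two cases scoring above 10, and flagged has one row per flagged case.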
Note
The code is adapted from the outlier function in the randomForest package.
Examples
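data(iris)
iris2 = iris
# Recode Species as a two-level factor: versicolor vs. the other species
iris2$Species = factor(iris$Species == "versicolor")
iris.cf = party::cforest(Species ~ ., data = iris2,
            control = party::cforest_unbiased(mtry = 2, ntree = 50))
# Proximity matrix of the fitted forest
prox = party::proximity(iris.cf)
# Outlierness within each class; the four measurements describe the outliers
Outliers(prox, iris2$Species, iris2[,1:4])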