Outliers {moreparty}    R Documentation

Computes outliers

Description

Computes outlierness scores from a proximity matrix and flags as outliers the observations whose score exceeds a threshold.

Usage

  Outliers(prox, cls=NULL, data=NULL, threshold=10)

Arguments

prox

A proximity matrix: a square matrix with 1 on the diagonal and values between 0 and 1 off the diagonal.

cls

Factor. The classes the rows in the proximity matrix belong to. If NULL (default), all data are assumed to come from the same class.

data

A data frame of variables to describe the outliers (optional).

threshold

Numeric. The value of outlierness above which an observation is considered an outlier. Default is 10.
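
As a rough illustration of the constraints on prox, the check below tests whether a matrix is square, has 1 on the diagonal and has all entries between 0 and 1. The helper name is_proximity_matrix is hypothetical and is not part of moreparty; this page does not say whether Outliers performs such a check itself.

```r
# Hypothetical helper, not part of moreparty: checks the properties
# that the 'prox' argument is described as having.
is_proximity_matrix <- function(m, tol = 1e-8) {
  is.matrix(m) && is.numeric(m) && !anyNA(m) &&
    nrow(m) == ncol(m) &&                 # square
    all(abs(diag(m) - 1) < tol) &&        # 1 on the diagonal
    all(m >= 0 & m <= 1)                  # values in [0, 1]
}

# Toy 3 x 3 proximity matrix (made-up values)
toy <- matrix(c(1.0, 0.8, 0.1,
                0.8, 1.0, 0.2,
                0.1, 0.2, 1.0), nrow = 3, byrow = TRUE)
is_proximity_matrix(toy)
is_proximity_matrix(toy[1:2, ])   # not square
```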

Details

The outlierness score of a case is computed as n / sum(squared proximity), then normalized within each class by subtracting the median and dividing by the MAD (median absolute deviation).
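
The computation described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: the function name outlierness_sketch is hypothetical, and it assumes, as randomForest::outlier does, that the squared proximities are summed over cases of the same class and that n is the total number of cases. The toy proximities are made up from points on a line.

```r
# Sketch of the outlierness computation (hypothetical function name).
# Assumption: squared proximities are summed over cases of the same
# class only, and n is the total number of cases.
outlierness_sketch <- function(prox, cls = NULL) {
  n <- nrow(prox)
  if (is.null(cls)) cls <- rep(1, n)      # one class for all cases
  cls <- factor(cls)
  scores <- rep(NA_real_, n)
  for (lev in levels(cls)) {
    in_class <- cls == lev
    raw <- n / rowSums(prox[in_class, in_class, drop = FALSE]^2)
    # Robust standardization within the class: median and MAD
    scores[in_class] <- (raw - median(raw)) / mad(raw)
  }
  scores
}

# Toy proximities: five nearby points and one distant point (case 6)
x <- c(0, 0.1, 0.2, 0.3, 0.4, 3)
prox <- exp(-abs(outer(x, x, "-")))       # 1 on the diagonal, values in (0, 1]
s <- outlierness_sketch(prox)
which(s > 10)                             # cases above the default threshold
```

Note that mad() in R rescales the median absolute deviation by a constant (1.4826 by default), and that a class whose raw scores are mostly identical has a MAD of zero, which makes the normalized scores undefined or infinite.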

Value

A list with the following elements:

scores

numeric vector containing the outlierness scores

outliers

numeric vector of the indexes of the outliers or, if data is supplied, a data frame with the outliers and their characteristics
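
To make the two forms of the outliers element concrete, here is a minimal sketch of how scores can be turned into outliers with a threshold. The scores and the data frame are made-up illustrative values, and the exact layout of the data frame returned by Outliers is an assumption and may differ.

```r
# Made-up outlierness scores for five cases (illustrative values only)
scores <- c(a = -0.4, b = 12.3, c = 0.7, d = 25.1, e = 1.9)
threshold <- 10

# Without 'data': indexes of the cases whose score exceeds the threshold
idx <- which(scores > threshold)

# With 'data': the same cases together with their descriptive variables.
# Assumption: the layout of the data frame returned by Outliers may differ.
data <- data.frame(height = c(1.6, 2.4, 1.7, 0.3, 1.8),
                   row.names = names(scores))
described <- data.frame(data[idx, , drop = FALSE], scores = scores[idx])
described
```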

Note

The code is adapted from the outlier function in the randomForest package.

Examples

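  data(iris)
  # Recode Species as a binary factor: versicolor vs. the rest
  iris2 <- iris
  iris2$Species <- factor(iris$Species == "versicolor")
  # Fit a conditional inference forest
  iris.cf <- party::cforest(Species ~ ., data = iris2,
             control = party::cforest_unbiased(mtry = 2, ntree = 50))
  # Proximity matrix of the forest, then outlierness scores within each class
  prox <- party::proximity(iris.cf)
  Outliers(prox, iris2$Species, iris2[, 1:4])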

[Package moreparty version 0.4 Index]