x3p4c {odetector}    R Documentation

Synthetic data set consisting of three variables with four clusters

Description

A synthetic data set created with the R package ‘MixSim’ (Melnykov et al., 2013). It consists of three continuous variables forming four clusters. The last ten rows of the data set (rows 121 to 130) contain the outliers, which are labeled as class "0".

Usage

data(x3p4c)

Format

A data frame with 130 rows and 4 variables (3 numeric variables and 1 class label):

p1

a continuous numeric variable

p2

a continuous numeric variable

p3

a continuous numeric variable

cl

an integer variable containing the class labels. The label 0 marks the generated outliers, while the labels 1-4 identify the four clusters.

Note

The data set x3p4c is recommended for learning outlier detection algorithms.
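
Because the outliers are already labeled, the cl column can serve as ground truth when trying out a detection method. The following is a minimal sketch, assuming the odetector package is installed and that the columns are named as listed under Format; the object names is_outlier and features are illustrative only and are not part of the package.

```r
# Load the data set (assumes the odetector package is installed)
data(x3p4c, package = "odetector")

# Count the observations per class label; label 0 marks the outliers
table(x3p4c$cl)

# Logical ground-truth vector: TRUE for the labeled outliers
is_outlier <- x3p4c$cl == 0

# Row indices of the labeled outliers
# (according to the Description, these should be rows 121 to 130)
which(is_outlier)

# Keep only the three numeric variables as input for a detection method
features <- x3p4c[, c("p1", "p2", "p3")]
```

The flags returned by any detection method applied to features can then be cross-tabulated against is_outlier, for example with table(), to see how many labeled outliers were recovered.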

References

Melnykov, V., Chen, W.-C. & Maitra, R. (2013). MixSim: An R package for simulating data to study performance of clustering algorithms. Journal of Statistical Software, 51(12):1-25.

Examples

data(x3p4c)
# Descriptive statistics of the data set
summary(x3p4c)
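# Plot the data set; the labels are shifted by 1 because color 0 is the
# background color, which would make the outliers (class 0) invisible
pairs(x3p4c[,-4], col=x3p4c[,4]+1, pch=19, cex=2)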

[Package odetector version 1.0.1 Index]