x3p4c {odetector} | R Documentation |
Synthetic data set consisting of three variables with four clusters
Description
A synthetic data set created with the R package ‘MixSim’ (Melnykov et al., 2013). It consists of three continuous variables forming four clusters. The last ten rows of the data set (rows 121 to 130) contain the outliers, which are labeled as class "0".
Usage
data(x3p4c)
Format
A data frame with 130 rows and 4 variables (three numeric variables and one class label):
- p1
a numeric continuous variable
- p2
a numeric continuous variable
- p3
a numeric continuous variable
- cl
an integer variable containing the class labels. The label 0 marks the generated outliers, and the labels 1-4 identify the four clusters.
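The label convention above can be used to separate the generated outliers from the clustered observations. The following is a minimal sketch, not part of ‘odetector’: the helper name split_by_label and the toy data frame are hypothetical and only mimic the documented column layout (p1, p2, p3, cl).

```r
# Hypothetical helper (not part of 'odetector'): split a data frame that
# follows the documented layout into clustered rows and outlier rows.
split_by_label <- function(df, label_col = "cl", outlier_label = 0) {
  is_outlier <- df[[label_col]] == outlier_label
  list(
    clusters = df[!is_outlier, , drop = FALSE],
    outliers = df[is_outlier, , drop = FALSE]
  )
}

# Toy stand-in with the same column layout (made-up values);
# replace 'toy' with x3p4c after calling data(x3p4c).
toy <- data.frame(
  p1 = c(1.0, 1.2, 5.0, 5.1, 9.9),
  p2 = c(2.0, 2.1, 6.0, 6.2, -3.0),
  p3 = c(0.5, 0.4, 3.0, 3.1, 8.0),
  cl = c(1L, 1L, 2L, 2L, 0L)
)

parts <- split_by_label(toy)
table(toy$cl)          # number of rows per class label
nrow(parts$outliers)   # number of rows labeled 0
```

Applied to x3p4c, the outliers element would be expected to hold the ten rows labeled 0, assuming the data set matches the description.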
Note
The data set x3p4c is recommended for learning outlier detection algorithms.
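Because the outliers are labeled, the output of any detection algorithm can be checked against the cl column. The sketch below shows one way to do this, under the assumption that the algorithm returns the row indices it flags. The helper score_detection and the toy labels are hypothetical and are not provided by ‘odetector’.

```r
# Hypothetical helper: compare flagged row indices with the true labels.
score_detection <- function(labels, flagged, outlier_label = 0) {
  truth <- labels == outlier_label            # TRUE for real outliers
  pred <- seq_along(labels) %in% flagged      # TRUE for flagged rows
  tp <- sum(pred & truth)
  c(
    true_positives  = tp,
    false_positives = sum(pred & !truth),
    false_negatives = sum(!pred & truth),
    precision = if (sum(pred) > 0) tp / sum(pred) else NA_real_,
    recall    = if (sum(truth) > 0) tp / sum(truth) else NA_real_
  )
}

# Toy labels (made up): rows 5 and 6 are outliers,
# and an imagined detector flagged rows 4 and 5.
labels <- c(1L, 1L, 2L, 2L, 0L, 0L)
res <- score_detection(labels, flagged = c(4, 5))
res
```

With the real data, labels would be x3p4c$cl and flagged would be the indices returned by the algorithm under study.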
References
Melnykov, V., Chen, W.-C. & Maitra, R. (2013). MixSim: An R package for simulating data to study performance of clustering algorithms. Journal of Statistical Software, 51(12), 1-25.
Examples
data(x3p4c)
# Descriptive statistics of the data set
summary(x3p4c)
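# Plot the data set, colored by class label
# (labels are shifted by 1 because color index 0 is the background color,
# which would make the outliers labeled 0 invisible)
pairs(x3p4c[,-4], col=x3p4c[,4]+1, pch=19, cex=2)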