robinden {ktaucenters}    R Documentation

Robust Initialization based on Inverse Density estimator (ROBINDEN)

Description

Searches for k initial cluster seeds for k-means-based clustering methods.

Usage

robinden(D, n_clusters, mp)

Arguments

D

a distance matrix containing the pairwise distances between the rows of a data matrix.

n_clusters

number of cluster centers to find.

mp

number of nearest neighbors used to compute the point density.
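As a minimal sketch of how these arguments might be prepared, the snippet below builds D from a small data matrix with base R's dist(). The toy data are hypothetical, and the check that mp is smaller than the number of observations is an assumption about sensible input, not a documented requirement of robinden.

```r
# Hypothetical toy data: 6 points in the plane, forming two tight groups
X <- matrix(c( 0,  0,
               0,  1,
               1,  0,
              10, 10,
              10, 11,
              11, 10), ncol = 2, byrow = TRUE)

# dist() returns a compact "dist" object; as.matrix() expands it into the
# full n x n symmetric matrix with zeros on the diagonal
D <- as.matrix(dist(X))

# Assumption: mp counts neighbours other than the point itself,
# so it should be smaller than the number of observations
mp <- 2
stopifnot(mp < nrow(D))
```

The resulting D and mp can then be passed as robinden(D = D, n_clusters = 2, mp = mp).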

Details

The centers are observations that lie in the densest regions and are, at the same time, far away from each other. To find observations in high-density regions, this function uses a point density estimate instead of the Local Outlier Factor (Breunig et al., 2000) used in the original ROBIN algorithm; see References.
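The idea described above can be illustrated with a simplified sketch. This is not the package's implementation: the helper names inverse_density_sketch and seed_sketch are hypothetical, the inverse density is assumed to be the mean distance to the mp nearest neighbors, and the rule of keeping only points below a quantile of the inverse density is an assumption made for illustration.

```r
# Hypothetical helper: inverse density taken as the mean distance to the
# mp nearest neighbours (an assumed estimator, for illustration only)
inverse_density_sketch <- function(D, mp) {
  # sort each row and skip the first entry, the zero self-distance
  apply(D, 1, function(d) mean(sort(d)[2:(mp + 1)]))
}

# Hypothetical helper: greedy seeding restricted to dense points
seed_sketch <- function(D, n_clusters, mp, dense_quantile = 0.5) {
  idp <- inverse_density_sketch(D, mp)
  # keep only points in dense regions (low inverse density)
  dense <- which(idp <= quantile(idp, dense_quantile))
  # start from the densest candidate
  centers <- dense[which.min(idp[dense])]
  while (length(centers) < n_clusters) {
    # distance from each dense candidate to its closest chosen center
    d_min <- apply(D[dense, centers, drop = FALSE], 1, min)
    # add the candidate that is farthest from all chosen centers
    centers <- c(centers, dense[which.max(d_min)])
  }
  list(centers = unname(centers), idpoints = unname(idp))
}

# Toy data: two compact groups plus one isolated outlier (row 41)
set.seed(1)
X <- rbind(matrix(rnorm(40, mean = 0,  sd = 0.5), ncol = 2),
           matrix(rnorm(40, mean = 10, sd = 0.5), ncol = 2),
           c(50, 50))
D <- as.matrix(dist(X))
res <- seed_sketch(D, n_clusters = 2, mp = 5)
```

Because the outlier has a very large inverse density, it is excluded from the candidate set before the farthest-point step, which is what keeps the seeds inside the two groups in this toy setting.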

Value

A list with the following components:

centers

A numeric vector with the indices of the initial cluster centers.

idpoints

A real vector containing the inverse point density estimates.

Note

This is a slightly modified version of the ROBIN algorithm implementation by Sarka Brodinova <sarka.brodinova@tuwien.ac.at>.

Author(s)

Juan Domingo Gonzalez <juanrst@hotmail.com>

References

Hasan AM, et al. Robust partitional clustering by outlier and density insensitive seeding. Pattern Recognition Letters, 30(11), 994-1002, 2009.

Examples

# Generate synthetic data (5 well-separated clusters)
K <- 5
nk <- 100
Z <- rnorm(2 * K * nk)
# Cluster means: recycle the K offsets to match the length of Z
mues <- rep(5 * -floor(K/2):floor(K/2), length.out = 2 * nk * K)
X <-  matrix(Z + mues, ncol = 2)

# Generate synthetic outliers (contamination level 20%)
X[sample(1:(nk * K), (nk * K) * 0.2), ] <-
  matrix(runif((nk * K) * 0.2 * 2, 3 * min(X), 3 * max(X)),
         ncol = 2,
         nrow = (nk * K)* 0.2)
res <- robinden(D = as.matrix(dist(X)), n_clusters = K, mp = 10)
# Plot the initial centers found
plot(X)
points(X[res$centers, ], pch = 19, col = 4, cex = 2)


[Package ktaucenters version 1.0.0]