inckmed {kmed} | R Documentation |
Increasing number of clusters in k-medoids algorithm
Description
This function runs the k-medoids algorithm with an increasing number of clusters, proposed by Yu et al. (2018).
Usage
inckmed(distdata, ncluster, iterate = 10, alpha = 1)
Arguments
distdata |
A distance matrix (n x n) or dist object. |
ncluster |
The number of clusters. |
iterate |
The number of iterations for the clustering algorithm. |
alpha |
A stretch factor to determine the range of initial medoid selection (see Details). |
Details
This algorithm is claimed to address the weakness of the
simple and fast k-medoids algorithm (fastkmed). The algorithm was
originally formulated as a centroid-based algorithm using the Euclidean
distance. Because this function is a medoid-based algorithm, the object mean
(centroid) and variance are redefined as the medoid and deviation, respectively.
The alpha argument is a stretch factor, i.e. a constant defined by
the user. It is applied to determine a set of medoid candidates. The medoid
candidates are calculated by
O_c = \{O_i | \sigma_i \leq \alpha \sigma, i = 1, 2, \ldots, n\},
where \sigma_i is the average deviation of object i, and \sigma
is the average deviation of the data set. They are computed by
\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^n d(O_i, v_1)}
\sigma_i = \sqrt{\frac{1}{n-1} \sum_{j=1}^n d(O_i, O_j)}
where n is the number of objects, O_i is object i,
and v_1 is the most centrally located object.
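The candidate selection described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used inside inckmed: the helper name medoid_candidates is hypothetical, and v_1 is assumed here to be the object with the smallest total distance to all other objects.

```r
# Hypothetical helper (not part of kmed): indices of the medoid candidates O_c.
medoid_candidates <- function(distdata, alpha = 1) {
  d <- as.matrix(distdata)   # accepts an n x n matrix or a dist object
  n <- nrow(d)
  # Assumption: v_1 is the object with the smallest sum of distances
  v1 <- which.min(rowSums(d))
  # sigma: average deviation of the data set, measured from v_1
  sigma <- sqrt(sum(d[, v1]) / (n - 1))
  # sigma_i: average deviation of each object i from all other objects
  sigma_i <- sqrt(rowSums(d) / (n - 1))
  # Keep the objects whose deviation is within alpha times sigma
  unname(which(sigma_i <= alpha * sigma))
}

num <- as.matrix(iris[, 1:4])
d <- dist(num)               # Euclidean distance, for illustration only
cand <- medoid_candidates(d, alpha = 1.5)
length(cand)                 # number of medoid candidates
```

Under this assumption \sigma equals the smallest \sigma_i, so alpha = 1 leaves only v_1 (and any ties) as candidates, while larger values of alpha widen the candidate set.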
Value
The function returns a list with the following components:
cluster
is the cluster membership of each object.
medoid
is the ids of the medoids.
minimum_distance
is the distance of all objects to their cluster medoid.
Author(s)
Weksi Budiaji
Contact: budiaji@untirta.ac.id
References
Yu, D., Liu, G., Guo, M., Liu, X., 2018. An improved K-medoids algorithm based on step increasing and optimizing medoids. Expert Systems with Applications 92, pp. 464-473.
Examples
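library(kmed)
# Numeric columns of the iris data as a matrix
num <- as.matrix(iris[, 1:4])
# Pairwise distances using the "mrw" method of distNumeric
mrwdist <- distNumeric(num, num, method = "mrw")
result <- inckmed(mrwdist, ncluster = 3, iterate = 50, alpha = 1.5)
# Cross-tabulate cluster memberships against the species labels
table(result$cluster, iris[, 5])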