mscseek {inaparc} | R Documentation |
Initialization of cluster prototypes using the modified SCS algorithm
Description
Initializes the cluster prototypes matrix using a modified version of the Simple Cluster Seeking (SCS) algorithm proposed by Tou & Gonzalez (1974). While SCS uses a fixed threshold distance value T for selecting all cluster candidates, the modified SCS recomputes T as the average Euclidean distance between the previously determined prototypes. This adjustment makes it possible to select more cluster prototypes than SCS does.
Usage
mscseek(x, k, tv)
Arguments
x |
a numeric vector, data frame or matrix. |
k |
an integer for the number of clusters. |
tv |
a number to be used as the threshold distance which is directly input by the user. Also it is possible to compute T, a threshold distance value with the following options of
|
Details
This is a modification of the Simple Cluster Seeking (SCS) algorithm (Tou & Gonzalez, 1974). The algorithm selects the first object in the data set as the prototype of the first cluster. It then searches for the next object whose distance to the first prototype is greater than the threshold distance value T and assigns it as the second cluster prototype. Instead of keeping T fixed as SCS does, the modified SCS recomputes T as the average Euclidean distance between the previously determined cluster prototypes. The next object whose distance to the previously selected object is greater than the adjusted T is then found and assigned as the third cluster prototype. The selection process is repeated in the same way for the remaining clusters. Because the method is sensitive to the order of the data, it may not yield good initializations with ordered data.
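The selection steps described above can be sketched in a few lines of base R. This is only an illustrative reading of the description, not the code used by inaparc: the function name mscs_sketch is hypothetical, the scan is assumed to continue forward from the last selected object, T is assumed to be the mean of all pairwise Euclidean distances between the prototypes chosen so far, and fewer than k prototypes are returned if the data run out.

```r
# Hypothetical helper illustrating the modified SCS steps; not part of inaparc.
mscs_sketch <- function(x, k, tv) {
  x <- as.matrix(x)
  n <- nrow(x)
  idx <- 1L    # the first object becomes the first prototype
  thr <- tv    # current threshold distance T (user-supplied at the start)
  i <- 2L
  while (length(idx) < k && i <= n) {
    last <- x[idx[length(idx)], ]          # most recently selected object
    d <- sqrt(sum((x[i, ] - last)^2))      # Euclidean distance to it
    if (d > thr) {
      idx <- c(idx, i)
      # Assumption: T becomes the mean pairwise distance between prototypes
      thr <- mean(as.vector(dist(x[idx, , drop = FALSE])))
    }
    i <- i + 1L
  }
  x[idx, , drop = FALSE]
}

# Toy data: five points on a line, in the order they are scanned
toy <- matrix(c(0, 0,  0.5, 0,  3, 0,  3.5, 0,  10, 0), ncol = 2, byrow = TRUE)
v <- mscs_sketch(toy, k = 3, tv = 1)
print(v)
```

Because the scan only moves forward, reordering the rows of the toy matrix can change which objects are picked, which is the order sensitivity noted above.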
Value
an object of class ‘inaparc’, which is a list consisting of the following items:
v |
a numeric matrix of the initial cluster prototypes. |
ctype |
a string representing the type of centroid which is used to build the prototype matrix. Its value is ‘obj’ with this function because the cluster prototype matrix consists of objects from the data set. |
call |
a string containing the matched function call that generates the object ‘inaparc’. |
Author(s)
Zeynel Cebeci, Cagatay Cebeci
References
Tou, J.T. & Gonzalez, R.C. (1974). Pattern Recognition Principles. Addison-Wesley, Reading, MA. <ISBN:9780201075861>
See Also
aldaoud, ballhall, crsamp, firstk, forgy, hartiganwong, inofrep, inscsf, insdev, kkz, kmpp, ksegments, ksteps, lastk, lhsmaximin, lhsrandom, maximin, rsamp, rsegment, scseek, scseek2, spaeth, ssamp, topbottom, uniquek, ursamp
Examples
library(inaparc)
data(iris)
# Run with the threshold value of 0.1
res <- mscseek(x=iris[,1:4], k=5, tv=0.1)
v1 <- res$v
print(v1)
# Run with the internally computed default threshold value
res <- mscseek(x=iris[,1:4], k=5)
v2 <- res$v
print(v2)