DSC_StreamKM {streamMOA} | R Documentation |
streamKM++
Description
This is an interface to the MOA implementation of streamKM++.
Usage
DSC_StreamKM(sizeCoreset = 10000, numClusters = 5, length = 100000L, ...)
Arguments
sizeCoreset | Size of the coreset. |
numClusters | Number of clusters to compute. |
length | Length of the data stream. |
... | Further arguments (ignored). |
Details
streamKM++ uses a tree-based sampling strategy to obtain a small weighted sample of the stream, called a coreset. The MOA implementation applies the k-means++ algorithm to find a given number of centers in the coreset.
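To make the second step more concrete, here is a minimal base R sketch of k-means++ style seeding on a weighted sample. It is an illustration only and not the MOA code: the function name kmeanspp_seed, the toy coreset, and the weights are all hypothetical, and the sketch covers only the seeding step, assuming each candidate is drawn with probability proportional to its weight times its squared distance to the nearest center chosen so far.

```r
# Weighted k-means++ seeding on a small weighted sample.
# Illustrative sketch only; kmeanspp_seed is a hypothetical helper,
# not a function from streamMOA or MOA.
kmeanspp_seed <- function(points, weights, k) {
  n <- nrow(points)
  centers <- matrix(NA_real_, nrow = k, ncol = ncol(points))

  # First center: drawn with probability proportional to the point weights.
  idx <- sample.int(n, 1, prob = weights)
  centers[1, ] <- points[idx, ]

  # Squared distance from every point to its nearest chosen center so far.
  d2 <- rowSums(sweep(points, 2, centers[1, ])^2)

  for (j in seq_len(k)[-1]) {
    # Next center: probability proportional to weight * squared distance.
    idx <- sample.int(n, 1, prob = weights * d2)
    centers[j, ] <- points[idx, ]
    d2 <- pmin(d2, rowSums(sweep(points, 2, centers[j, ])^2))
  }
  centers
}

set.seed(42)
# Toy stand-in for a coreset: three groups of 20 two-dimensional points.
coreset <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.1), ncol = 2)
)
w <- runif(nrow(coreset), min = 1, max = 10)  # made-up coreset weights

centers <- kmeanspp_seed(coreset, w, k = 3)
centers
```

Because the coreset is small, this seeding runs on far fewer points than the original stream contains, which is the main reason for building the coreset first.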
Notes:
The clusterer can only process the number of points specified in length. Feeding more points than that produces an ArrayIndexOutOfBoundsException error.
The coreset (micro-clusters) is not accessible; only the macro-clusters can be requested.
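One way to stay within the limit described in the first note is to track how many points have already been passed to the clusterer and to shrink the last batch accordingly. The following base R sketch assumes this bookkeeping is done by the caller. The helper remaining_batch and the variables max_points, seen, and sizes are hypothetical and not part of streamMOA.

```r
# Hypothetical helper (not part of streamMOA): how many points of the next
# batch can still be passed to the clusterer without exceeding `length`.
remaining_batch <- function(seen, batch, length) {
  stopifnot(seen >= 0, batch >= 0, length >= 0)
  max(0L, min(as.integer(batch), as.integer(length) - as.integer(seen)))
}

max_points <- 1000L  # the value that would be passed as `length`
seen <- 0L           # points passed to the clusterer so far
sizes <- integer(0)  # batch sizes actually used

repeat {
  n <- remaining_batch(seen, batch = 300L, length = max_points)
  if (n == 0L) break
  # With stream and streamMOA loaded, this is where a call such as
  # update(streamkm, stream, n) would go.
  seen <- seen + n
  sizes <- c(sizes, n)
}
sizes
```

The loop stops once the budget is used up, so the total number of points never exceeds the value given as length.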
Author(s)
Matthias Carnein
References
Marcel R. Ackermann, Christiane Lammersen, Marcus Maertens, Christoph Raupach, Christian Sohler, Kamil Swierkot. StreamKM++: A Clustering Algorithm for Data Streams. In: Proceedings of the 12th Workshop on Algorithm Engineering and Experiments (ALENEX '10), 2010.
See Also
Other DSC_MOA: DSC_BICO_MOA(), DSC_CluStream(), DSC_ClusTree(), DSC_DStream_MOA(), DSC_DenStream(), DSC_MCOD(), DSC_MOA()
Examples
library(stream)
library(streamMOA)

set.seed(1000)
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)

# cluster with streamKM++ (at most length = 1000 points can be clustered)
streamkm <- DSC_StreamKM(sizeCoreset = 100, numClusters = 3, length = 1000)
update(streamkm, stream, 100)
streamkm

# plot macro-clusters (no access to micro-clusters)
plot(streamkm, stream)