CirClust {OptCirClust} | R Documentation |
Circular Data Clustering
Description
Perform clustering on circular data to minimize the within-cluster sum of squared distances.
Usage
CirClust(O, K, Circumference, method = c("FOCC", "HEUC", "BOCC"))
Arguments
O |
a vector of circular data points. They can be coordinates along the circle based on distance, or angles around the circle. |
K |
the number of clusters |
Circumference |
the circumference of the circle where data are located |
method |
the circular clustering method, one of "FOCC", "HEUC", or "BOCC". |
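The description of O says the data may be distances along the circle or angles around it. A minimal sketch of how the same hypothetical observations could be expressed in each unit, together with the matching Circumference, follows. The data values and the radius r are assumptions made for illustration, and the CirClust calls run only if the package is installed.

```r
# Hypothetical angular observations in degrees
angles_deg <- c(350, 355, 5, 10, 170, 175, 180, 185)

# The same points in radians; the matching circumference is 2 * pi
angles_rad <- angles_deg * pi / 180

# The same points as arc lengths on a circle of assumed radius r;
# the matching circumference is 2 * pi * r
r <- 3
arc_len <- angles_rad * r
circ_arc <- 2 * pi * r

# Each representation pairs the data with its own circumference
if (requireNamespace("OptCirClust", quietly = TRUE)) {
  res_deg <- OptCirClust::CirClust(angles_deg, K = 2, Circumference = 360,
                                   method = "FOCC")
  res_rad <- OptCirClust::CirClust(angles_rad, K = 2, Circumference = 2 * pi,
                                   method = "FOCC")
  res_arc <- OptCirClust::CirClust(arc_len, K = 2, Circumference = circ_arc,
                                   method = "FOCC")
}
```

Whichever unit is chosen, the positions relative to the full loop are the same, so O and Circumference only need to be expressed in the same unit.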
Details
By circular data, we broadly refer to data points on any non-self-intersecting loop.
In clustering N circular points into K clusters, the "FOCC" algorithm is reproducible with runtime O(K N log^2 N) (Debnath and Song 2021).
The "HEUC" algorithm, not always reproducible, calls the kmeans function repeatedly.
The "BOCC" algorithm, with runtime O(K N^2), is reproducible but slow; it repeatedly calls the Ckmeans.1d.dp function.
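The brute-force idea behind an approach like "BOCC" can be sketched in base R: cut the circle open at each data point in turn, solve the resulting linear clustering problem optimally, and keep the cut with the smallest total within-cluster sum of squares. This is only an illustration under stated assumptions, not the package's implementation. The helpers linear_ssq and brute_force_circular are hypothetical names, and a simple dynamic program stands in for Ckmeans.1d.dp.

```r
# Minimum total within-cluster sum of squares when the sorted vector x is
# split into K contiguous clusters (simple dynamic program, for small inputs).
linear_ssq <- function(x, K) {
  n <- length(x)
  stopifnot(K >= 1, K <= n)
  s1 <- c(0, cumsum(x))     # prefix sums
  s2 <- c(0, cumsum(x^2))   # prefix sums of squares
  # Sum of squared deviations from the mean for x[i..j]
  ssq <- function(i, j) {
    m <- j - i + 1
    max(0, (s2[j + 1] - s2[i]) - (s1[j + 1] - s1[i])^2 / m)
  }
  D <- matrix(Inf, nrow = K, ncol = n)
  for (j in seq_len(n)) D[1, j] <- ssq(1, j)
  if (K >= 2) {
    for (k in 2:K) {
      for (j in k:n) {
        for (i in k:j) {    # i is the first index of the k-th cluster
          cand <- D[k - 1, i - 1] + ssq(i, j)
          if (cand < D[k, j]) D[k, j] <- cand
        }
      }
    }
  }
  D[K, n]
}

# Try every frame (cut point) and keep the one with the smallest total.
brute_force_circular <- function(O, K, Circumference) {
  x <- sort(O %% Circumference)
  n <- length(x)
  best <- list(ssq = Inf, start = NA_integer_)
  for (s in seq_len(n)) {
    # Frame starting at point s: earlier points are moved one lap forward
    frame <- if (s == 1) x else c(x[s:n], x[1:(s - 1)] + Circumference)
    val <- linear_ssq(frame, K)
    if (val < best$ssq) best <- list(ssq = val, start = s)
  }
  best
}

O <- c(1, 2, 10, 11, 12, 13, 14, 15, 27, 28, 29, 30, 31, 32, 40, 41)
res <- brute_force_circular(O, K = 3, Circumference = 42)
```

Trying all N frames is what makes the brute-force route slow; the sketch is meant only to show why a cluster may need to wrap around the origin of the circle.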
Value
An object of class "CirClust", which has a plot method. It is a list with the following components:
cluster |
a vector of clusters assigned to each element in O. |
centers |
a numeric vector of the means for each cluster in the circular data. |
withinss |
a numeric vector of the within-cluster sum of squares for each cluster. |
size |
a vector of the number of elements in each cluster. |
totss |
the total sum of squared distances between each element and the sample mean. This statistic is not dependent on the clustering result. |
tot.withinss |
the total sum of within-cluster squared distances between each element and its cluster mean. This statistic is minimized given the number of clusters. |
betweenss |
the sum of squared distances between each cluster mean and sample mean. This statistic is maximized given the number of clusters. |
ID |
the starting index of the frame with the minimum sum of squares (SSQ). |
Border |
the borders of the K clusters. |
Border.mid |
the middle point of the last and first points of two consecutive clusters. |
O_name |
a character string. The actual name of the O argument. |
Circumference |
the circumference of the circular or periodic data. |
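For a cluster that straddles the origin of the circle, the mean and the within-cluster sum of squares are only meaningful once the points are unrolled onto a common lap. The sketch below works through one such cluster by hand in base R. The cluster membership is assumed for illustration, and this is not necessarily how the package computes centers or withinss internally.

```r
Circumference <- 42

# Hypothetical cluster that wraps around the origin of the circle
pts <- c(40, 41, 1, 2)

# Unroll: points located before the first point are moved one lap forward
unrolled <- ifelse(pts < pts[1], pts + Circumference, pts)

# Circular center: mean on the unrolled scale, wrapped back onto the circle
center <- mean(unrolled) %% Circumference

# Within-cluster sum of squares on the unrolled scale
withinss <- sum((unrolled - mean(unrolled))^2)

# For contrast: ignoring the wrap-around puts the "mean" on the far side
naive_center <- mean(pts)
naive_withinss <- sum((pts - naive_center)^2)
```

Here the unrolled points are 40, 41, 43, 44, so the center falls at the origin of the circle, while the naive linear mean lands at 21, half a lap away from every point in the cluster.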
References
Debnath T, Song M (2021). “Fast optimal circular clustering and applications on round genomes.” IEEE/ACM Transactions on Computational Biology and Bioinformatics. doi: 10.1109/TCBB.2021.3077573.
Examples
O <- c(1,2, 10,11,12,13,14,15, 27,28,29,30,31,32, 40,41)
K <- 3
Circumference <- 42
# Perform circular clustering:
output <- CirClust(O, K, Circumference)
# Visualize the circular clusters:
opar <- par(mar=c(1,1,2,1))
plot(output)
par(opar)