clus.torus {ClusTorus} | R Documentation |
Clustering on the torus by conformal prediction
Description
clus.torus returns clustering results for data on the torus, based on
inductive conformal prediction sets.
Usage
clus.torus(
data,
split.id = NULL,
model = c("kmeans", "mixture"),
mixturefitmethod = c("axis-aligned", "circular", "general"),
kmeansfitmethod = c("general", "homogeneous-circular", "heterogeneous-circular",
"ellipsoids"),
J = NULL,
level = NULL,
option = NULL,
verbose = TRUE,
...
)
## S3 method for class 'clus.torus'
plot(
x,
panel = 1,
assignment = "outlier",
data = NULL,
ellipse = TRUE,
type = NULL,
overlay = FALSE,
out = FALSE,
...
)
Arguments
data |
n x d matrix of toroidal data on |
split.id |
An n-dimensional vector consisting of the values 1 (estimation) and 2 (evaluation) |
model |
A string. One of "mixture" and "kmeans", which
determines the model or estimation method. If "mixture", the model is based
on the von Mises mixture, fitted
with an EM algorithm. It supports conformity scores based on the von Mises
mixture and its variants. If "kmeans", the model is also based on the von
Mises mixture, but the parameters are estimated with the
elliptical k-means algorithm. It supports only the
log-max-mixture based conformity score. If the
dimension of the data space is greater than 2, only "kmeans" is supported.
Default is |
mixturefitmethod |
A string. One of "circular", "axis-aligned", and
"general", which determines the constraint on the EM fitting. Default is
"axis-aligned". This argument only works for |
kmeansfitmethod |
A string. One of "general", "ellipsoids",
"heterogeneous-circular" or "homogeneous-circular". If "general", the
elliptical k-means algorithm with no constraint is used. If "ellipsoids",
only one iteration of the algorithm is used. If "heterogeneous-circular",
the same as above, but with the constraint that the ellipsoids must be spheres.
If "homogeneous-circular", the same as above, but the radii of the spheres are
identical. Default is "general". This argument only works for |
J |
the number of components for mixture model fitting. If |
level |
a scalar in |
option |
A string. One of "elbow", "risk", "AIC", or "BIC", which determines the
criterion for model selection. "risk" is based on the negative log-likelihood, "AIC"
stands for the Akaike Information Criterion, and "BIC" for the Bayesian Information Criterion.
"elbow" is based on minimizing the criterion used in Jung et al. (2021).
This argument is only used if |
verbose |
A boolean that indicates whether to display
additional details about what the algorithm is doing or
how many loops have been done. Default is |
... |
Further arguments that will be passed to |
x |
|
panel |
One of 1 or 2 which determines the type of plot. If |
assignment |
A string. One of "outlier", "log.density", "posterior", "mahalanobis". Default is "outlier". |
ellipse |
A boolean that determines whether to plot ellipse intersections. Default is |
type |
A string. One of "mix", "max" or "e". This argument is only available if |
overlay |
A boolean that determines whether to plot ellipse intersections on clustering plots. Default is |
out |
An option for returning the ggplot object. Default is |
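The split.id argument described above can be built by hand in base R. The following is a minimal sketch, not a requirement of the function: the sample size n = 100, the seed, and the half-and-half split are all assumptions made for illustration.

```r
set.seed(1)                              # arbitrary seed for a reproducible split
n <- 100                                 # hypothetical number of observations
split.id <- rep(2, n)                    # start with every row marked 2 (evaluation)
split.id[sample(n, floor(n / 2))] <- 1   # mark a random half as 1 (estimation)
print(table(split.id))                   # counts of each label
```

A vector built this way can be passed as split.id so that the same estimation/evaluation split is reused across calls.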
Details
clus.torus is a user-friendly all-in-one function that carries out the following
procedures automatically:
1. compute conformity scores for the given model and fitting method,
2. choose the optimal model and level based on a prespecified criterion, and
3. make clusters based on the chosen model and level.
Procedures 1-3 can also be run independently with icp.torus, hyperparam.torus,
hyperparam.J, hyperparam.alpha and cluster.assign.torus.
For more detail on each procedure, see icp.torus, hyperparam.J, hyperparam.alpha,
hyperparam.torus, and cluster.assign.torus.
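The three procedures can be sketched as separate calls. This is a hedged sketch rather than the package's internal code: it assumes the ClusTorus package is installed, that hyperparam.torus returns a list with elements named icp.torus and alphahat, and that cluster.assign.torus accepts a level argument. The grid J = 4:8 and the seed are arbitrary choices for illustration.

```r
step.out <- NULL  # stays NULL when ClusTorus is not installed
if (requireNamespace("ClusTorus", quietly = TRUE)) {
  library(ClusTorus)
  set.seed(2021)                     # arbitrary seed for reproducibility
  data <- toydata2[, 1:2]

  # 1. conformity scores for each candidate number of components
  icp.objects <- icp.torus(data, model = "kmeans", kmeansfitmethod = "general",
                           J = 4:8, verbose = FALSE)

  # 2. choose the number of components and the level with the "risk" criterion
  hyper.out <- hyperparam.torus(icp.objects, option = "risk")

  # 3. assign clusters with the chosen model and level
  #    (the element names icp.torus and alphahat are assumptions;
  #     inspect str(hyper.out) to confirm them)
  step.out <- cluster.assign.torus(hyper.out$icp.torus,
                                   level = hyper.out$alphahat)
}
```

Running the steps separately is mainly useful when one stage, such as the fitted conformity scores, should be reused with several criteria or levels.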
Value
clus.torus returns a clus.torus object, which consists of the following 3 S3 objects:
cluster.obj
cluster.obj object; clustering assignment results for several methods. For details, see cluster.assign.torus.
icp.torus
icp.torus object; contains model parameters and conformity scores. For details, see icp.torus.
hyperparam.select
hyperparam.torus object (if J = NULL or a sequence of numbers, and level = NULL or a sequence of numbers), hyperparam.J object (if level is a scalar), or hyperparam.alpha object (if J is a scalar); contains information on the optimally chosen model (number of components J) and level (alpha) based on the prespecified criterion. For details, see hyperparam.torus, hyperparam.J, and hyperparam.alpha.
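As a brief illustration of the return value, the components can be inspected by name. The sketch assumes that ClusTorus is installed and that the three objects are stored as list elements under the names given above. The grid J = 4:8 is an arbitrary small range chosen only to keep the run short.

```r
res <- NULL  # stays NULL when ClusTorus is not installed
if (requireNamespace("ClusTorus", quietly = TRUE)) {
  library(ClusTorus)
  set.seed(2021)                     # arbitrary seed for reproducibility
  res <- clus.torus(data = toydata2[, 1:2], model = "kmeans",
                    kmeansfitmethod = "general", J = 4:8, option = "risk",
                    verbose = FALSE)
  print(names(res))                     # component names of the returned object
  print(class(res$hyperparam.select))   # which hyperparam.* class was produced
}
```

Because J is a sequence and level is left as NULL here, the description above suggests that hyperparam.select should be a hyperparam.torus object.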
References
Jung, S., Park, K., & Kim, B. (2021). Clustering on the torus by conformal prediction. The Annals of Applied Statistics, 15(4), 1583-1603.
Mardia, K. V., Kent, J. T., Zhang, Z., Taylor, C. C., & Hamelryck, T. (2012). Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. Journal of Applied Statistics, 39(11), 2475-2492.
Shin, J., Rinaldo, A., & Wasserman, L. (2019). Predictive clustering. arXiv preprint arXiv:1903.08125.
See Also
icp.torus, hyperparam.torus, hyperparam.J, hyperparam.alpha, cluster.assign.torus
Examples
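# Use the first two columns of the toydata2 dataset
data <- toydata2[, 1:2]
n <- nrow(data)
# Fit with elliptical k-means, choosing J from 5 to 30 by the "risk" criterion
clus.torus(data = data, model = "kmeans", kmeansfitmethod = "general", J = 5:30, option = "risk")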
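The fitted object can also be passed to the plot method listed under Usage. This is a hedged sketch, assuming ClusTorus and its plotting dependencies are installed. The grid J = 4:8 is an arbitrary small range, and out = TRUE is used on the assumption that it makes the call return the plot object, as the argument description suggests.

```r
p <- NULL  # stays NULL when ClusTorus is not installed
if (requireNamespace("ClusTorus", quietly = TRUE)) {
  library(ClusTorus)
  set.seed(2021)                     # arbitrary seed for reproducibility
  fit <- clus.torus(data = toydata2[, 1:2], model = "kmeans",
                    kmeansfitmethod = "general", J = 4:8, option = "risk",
                    verbose = FALSE)
  # panel = 1 with the default "outlier" assignment;
  # out = TRUE asks for the plot object to be returned
  p <- plot(fit, panel = 1, assignment = "outlier", out = TRUE)
}
```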