R: Conformity score for inductive prediction sets

icp.torus {ClusTorus}

R Documentation

Conformity score for inductive prediction sets

Description

icp.torus prepares all values for computing the conformity score for specified methods.

plot.icp.torus plots an icp.torus object, with options for the level, the ellipse display, and the plot type.

Usage

icp.torus(
  data,
  split.id = NULL,
  model = c("kmeans", "kde", "mixture"),
  mixturefitmethod = c("axis-aligned", "circular", "general"),
  kmeansfitmethod = c("general", "homogeneous-circular", "heterogeneous-circular",
    "ellipsoids"),
  init = c("hierarchical", "kmeans"),
  d = NULL,
  additional.condition = TRUE,
  J = 4,
  concentration = 25,
  kmax = 500,
  THRESHOLD = 1e-10,
  maxiter = 200,
  verbose = TRUE,
  ...
)

## S3 method for class 'icp.torus'
logLik(object, ...)

## S3 method for class 'icp.torus'
predict(object, newdata, ...)

## S3 method for class 'icp.torus'
plot(
  x,
  data = NULL,
  level = 0.1,
  ellipse = TRUE,
  out = FALSE,
  type = NULL,
  ...
)

Arguments

`data`	n x d matrix of toroidal data on `[0, 2\pi)^d` or `[-\pi, \pi)^d`. Required for `icp.torus`; for `plot`, the default is `NULL`.
`split.id`	an n-dimensional vector consisting of the values 1 (estimation) and 2 (evaluation). Default is `NULL`.
`model`	A string. One of "kde", "mixture", or "kmeans", which determines the model or estimation method. If "kde", the model is based on kernel density estimates and supports only the kde-based conformity score. If "mixture", the model is based on the von Mises mixture, fitted with an EM algorithm, and supports conformity scores based on the von Mises mixture and its variants. If "kmeans", the model is also based on the von Mises mixture, but the parameters are estimated with the elliptical k-means algorithm illustrated in Appendix; it supports only the log-max-mixture based conformity score. If the dimension of the data space is greater than 2, only "kmeans" is supported. Default is `model = "kmeans"`.
`mixturefitmethod`	A string. One of "circular", "axis-aligned", and "general" which determines the constraint of the EM fitting. Default is "axis-aligned". This argument only works for `model = "mixture"`.
`kmeansfitmethod`	A string. One of "general", "ellipsoids", "heterogeneous-circular", or "homogeneous-circular". If "general", the elliptical k-means algorithm with no constraint is used. If "ellipsoids", only one iteration of the algorithm is used. If "heterogeneous-circular", the same as above, but with the constraint that the ellipsoids must be spheres. If "homogeneous-circular", the same as above, but the radii of the spheres are identical. Default is "general". This argument only works for `model = "kmeans"`.
`init`	Method for choosing the initial values of the "kmeans" fitting. Must be "hierarchical" or "kmeans". If "hierarchical", the initial parameters are obtained with a hierarchical clustering method. If "kmeans", the initial parameters are obtained with the extrinsic k-means method. Additional arguments for k-means clustering and hierarchical clustering can be passed via the argument `...`. If no options are given, `nstart=1` is used for `init="kmeans"` and `method="complete"` for `init="hierarchical"`. Default is "hierarchical".
`d`	pairwise distance matrix (`dist` object) for `init = "hierarchical"`, which is used in hierarchical clustering. If `init = "hierarchical"` and `d = NULL`, `d` will be automatically filled with `ang.pdist(data)`.
`additional.condition`	boolean index. If `TRUE`, a singular matrix will be replaced by the scaled identity. Default is `TRUE`.
`J`	A scalar or numeric vector for the number(s) of components for `model = c("mixture", "kmeans")`. Default is `J = 4`.
`concentration`	A scalar or numeric vector for the concentration parameter(s) for `model = "kde"`. Default is `concentration = 25`.
`kmax`	the maximum allowed value of kappa. If an estimated kappa is larger than `kmax`, kappa is set to `kmax`. Default is 500.
`THRESHOLD`	threshold for the difference between the parameters before and after an update. Default is 1e-10.
`maxiter`	the maximum number of iterations. Default is 200.
`verbose`	boolean index indicating whether to display additional details about what the algorithm is doing or how many loops have been done. In addition, if `additional.condition` is `TRUE`, the associated warning message is reported. Default is `TRUE`.
`...`	additional parameters. For plotting icp.torus, these parameters are for ggplot2::ggplot().
`object`	`icp.torus` object
`newdata`	n x d matrix of toroidal data on `[0, 2\pi)^d`. The dimension d must match that of the data used to fit the `icp.torus` object.
`x`	`icp.torus` object
`level`	either a numeric scalar or a vector in `[0,1]`. Default value is 0.1.
`ellipse`	A boolean index which determines whether to plot ellipses from mixture models. Default is `TRUE`. (This option is used only when the `icp.torus` object `x` is fitted with `model = "kmeans"` or `model = "mixture"`.)
`out`	An option for returning the ggplot object. Default is `FALSE`.
`type`	A string. One of "mix", "max" or "e". This argument is only available if `icp.torus` object is fitted with `model = "mixture"`. Default is `NULL`. If `type != NULL`, argument `ellipse` automatically becomes `FALSE`. If "mix", it plots based on von Mises mixture. If "max", it plots based on von Mises max-mixture. If "e", it plots based on ellipse-approximation.
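
As an illustration of the `data` and `split.id` arguments, the following base R sketch wraps raw angles into `[0, 2\pi)` and builds an estimation/evaluation indicator. The simulated angles, the object names `raw` and `angles`, and the half-and-half split are assumptions made for this example only; they are not prescribed by the package.

```r
# Hypothetical raw angles in radians, some of them outside [0, 2*pi)
set.seed(1)
raw <- matrix(runif(200, min = -pi, max = 3 * pi), ncol = 2)

# Wrap every angle using the modulo operator so that it lies in [0, 2*pi)
angles <- raw %% (2 * pi)

# Mark each row as 1 (estimation) or 2 (evaluation).
# The half-and-half random split is an assumption of this sketch.
n <- nrow(angles)
split.id <- rep(2, n)
split.id[sample(n, floor(n / 2))] <- 1

table(split.id)
```

Objects built this way could then be supplied as `icp.torus(angles, split.id = split.id, ...)`.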

Value

If J or concentration is a single value, icp.torus returns an icp.torus object containing all values needed to compute the conformity score. If J or concentration is a vector containing multiple values, icp.torus returns a list of icp.torus objects.
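
Because the return value is either one object or a list of objects, downstream code may want to treat both cases uniformly. The sketch below uses a hypothetical helper, `as_icp_list`, which is not part of ClusTorus. It assumes that a single fit carries the S3 class `icp.torus` and that the list returned for multiple values does not; the mock objects stand in for real fits so that the sketch runs without the package.

```r
# Hypothetical helper (not part of ClusTorus): always return a list of fits.
# Assumes a single fit inherits from "icp.torus" and the multi-value result does not.
as_icp_list <- function(x) {
  if (inherits(x, "icp.torus")) list(x) else x
}

# Mock objects standing in for real fits
single <- structure(list(), class = "icp.torus")
many <- list(single, single)

length(as_icp_list(single))  # 1
length(as_icp_list(many))    # 2
```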

References

Jung, S., Park, K., & Kim, B. (2021). Clustering on the torus by conformal prediction. The Annals of Applied Statistics, 15(4), 1583-1603.

Mardia, K. V., Kent, J. T., Zhang, Z., Taylor, C. C., & Hamelryck, T. (2012). Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. Journal of Applied Statistics, 39(11), 2475-2492.

Di Marzio, M., Panzera, A., & Taylor, C. C. (2011). Kernel density estimation on the torus. Journal of Statistical Planning and Inference, 141(6), 2156-2173.

Shin, J., Rinaldo, A., & Wasserman, L. (2019). Predictive clustering. arXiv preprint arXiv:1903.08125.

Examples


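library(ClusTorus)

# Use the first two columns of the bundled toy data
data <- toydata1[, 1:2]

# Fit with elliptical k-means and J = 4 components;
# `concentration` applies only to model = "kde", so it is omitted here
icp.fit <- icp.torus(data, model = "kmeans",
                     kmeansfitmethod = "general",
                     J = 4)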
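
The S3 methods listed under Usage can be applied to a fitted object. The following is a minimal sketch rather than a definitive workflow: it assumes the ClusTorus package and its `toydata1` data set are available, reuses the first five rows as stand-in new points, and introduces the names `fit`, `newdata`, and `pred` for illustration only.

```r
library(ClusTorus)

data <- toydata1[, 1:2]

# Fit once; verbose = FALSE only suppresses progress messages
fit <- icp.torus(data, model = "kmeans",
                 kmeansfitmethod = "general",
                 J = 4, verbose = FALSE)

# Apply the predict method to new points
# (the first five rows are reused here purely as stand-ins)
newdata <- as.matrix(data[1:5, ])
pred <- predict(fit, newdata = newdata)

# Plot the fitted object at level 0.1 with the default ellipse display
plot(fit, level = 0.1)
```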

[Package ClusTorus version 0.2.2 Index]