kmeans_align {fdasrvf}    R Documentation

K-Means Clustering and Alignment

Description

This function jointly clusters and aligns functions using the elastic square-root velocity function (SRVF) framework.
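
For orientation, the SRVF of a real-valued function f is usually defined as q(t) = sign(f'(t)) * sqrt(|f'(t)|). The sketch below computes it with finite differences in base R. The helper name srvf_sketch is hypothetical and is not part of the package; it only illustrates the transform and is not necessarily how the package computes it.

```r
# Hypothetical helper: SRVF of one curve sampled on a grid (needs >= 3 points).
srvf_sketch <- function(f, time) {
  M <- length(f)
  df <- numeric(M)
  # One-sided differences at the two end points.
  df[1] <- (f[2] - f[1]) / (time[2] - time[1])
  df[M] <- (f[M] - f[M - 1]) / (time[M] - time[M - 1])
  # Central differences in the interior.
  idx <- 2:(M - 1)
  df[idx] <- (f[idx + 1] - f[idx - 1]) / (time[idx + 1] - time[idx - 1])
  # q(t) = sign(f'(t)) * sqrt(|f'(t)|)
  sign(df) * sqrt(abs(df))
}

time <- seq(0, 1, length.out = 101)
q <- srvf_sketch(time^2, time)  # f(t) = t^2, so q(t) is close to sqrt(2 * t)
```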

Usage

kmeans_align(
  f,
  time,
  K = 1L,
  seeds = NULL,
  centroid_type = c("mean", "medoid"),
  nonempty = 0L,
  lambda = 0,
  showplot = FALSE,
  smooth_data = FALSE,
  sparam = 25L,
  parallel = FALSE,
  alignment = TRUE,
  rotation = FALSE,
  scale = TRUE,
  omethod = c("DP", "RBFGS"),
  max_iter = 50L,
  thresh = 0.01,
  use_verbose = FALSE
)

Arguments

f

Either a numeric matrix or a numeric 3D array specifying the functions that need to be jointly clustered and aligned.

  • If a matrix, it must be of shape M x N. In this case, it is interpreted as a sample of N curves observed on a grid of size M.

  • If a 3D array, it must be of shape L x M x N and it is interpreted as a sample of N L-dimensional curves observed on a grid of size M.

For multidimensional functional data, it is advised to set rotation = FALSE.
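
The two accepted shapes can be illustrated with simulated data. Everything below (the sizes M, N, L, the shifted bumps and the circle-like curves) is made up for illustration and does not come from the package.

```r
# Hypothetical simulated data, only to illustrate the expected shapes of `f`.
M <- 101  # grid size
N <- 20   # number of curves
L <- 2    # dimension of each curve (multidimensional case)

time <- seq(0, 1, length.out = M)
shifts <- seq(-0.2, 0.2, length.out = N)

# Univariate case: M x N matrix, one curve per column.
f_mat <- sapply(shifts, function(s) exp(-(time - 0.5 - s)^2 / 0.02))
dim(f_mat)  # 101 20

# Multidimensional case: L x M x N array, one L-dimensional curve per slice.
f_arr <- array(0, dim = c(L, M, N))
for (n in seq_len(N)) {
  f_arr[1, , n] <- cos(2 * pi * (time + shifts[n]))
  f_arr[2, , n] <- sin(2 * pi * (time + shifts[n]))
}
dim(f_arr)  # 2 101 20
```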

time

A numeric vector of length M specifying the grid on which the curves are evaluated.

K

An integer value specifying the number of clusters. Defaults to 1L.

seeds

An integer vector of length K specifying the indices of the curves in f which will be chosen as initial centroids. Defaults to NULL in which case such indices are randomly chosen.
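
When reproducible results are needed, one option is to draw the initial centroids explicitly and pass them through seeds. A minimal sketch, assuming N curves and K clusters (both values below are placeholders):

```r
# Assumed values for illustration: N curves in `f`, K clusters.
N <- 30
K <- 2

set.seed(1234)                    # fix the RNG so the draw is reproducible
seeds <- sample.int(N, size = K)  # K distinct curve indices in 1..N
```

The resulting vector can then be supplied as seeds = seeds in the call.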

centroid_type

A string specifying the type of centroid to compute. Choices are "mean" or "medoid". Defaults to "mean".
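
The usage lists both choices as a vector default. In R this is conventionally resolved with match.arg(), which returns the first choice when the argument is not supplied. The sketch below uses a hypothetical function, pick_centroid, to show that convention; it is an assumption that kmeans_align resolves centroid_type and omethod the same way.

```r
# Hypothetical function illustrating the match.arg() convention.
pick_centroid <- function(centroid_type = c("mean", "medoid")) {
  match.arg(centroid_type)
}

pick_centroid()          # "mean"
pick_centroid("medoid")  # "medoid"
```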

nonempty

An integer value specifying the minimum number of curves per cluster during the assignment step. Set it to a positive value to avoid the problem of empty clusters. Defaults to 0L.

lambda

A numeric value specifying the elasticity. Defaults to 0.0.

showplot

A Boolean specifying whether to show plots. Defaults to FALSE.

smooth_data

A Boolean specifying whether to smooth data using a box filter. Defaults to FALSE.

sparam

An integer value specifying the number of box filters applied. Defaults to 25L.
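
To give a feel for what repeated box filtering does, here is a sketch that applies a three-point weighted average sparam times to each curve. The helper name box_smooth and the [1, 2, 1] / 4 kernel are assumptions made for illustration; the package's own filter may differ.

```r
# Hypothetical helper: repeated three-point smoothing of the columns of `f`.
box_smooth <- function(f, sparam = 25L) {
  f <- as.matrix(f)
  M <- nrow(f)
  if (M < 3L) return(f)  # nothing to smooth on very short grids
  interior <- 2:(M - 1)
  for (pass in seq_len(sparam)) {
    # Weighted average of each interior point with its two neighbours;
    # the end points are left unchanged.
    f[interior, ] <- (f[interior - 1, , drop = FALSE] +
                      2 * f[interior, , drop = FALSE] +
                      f[interior + 1, , drop = FALSE]) / 4
  }
  f
}

# A single spike is spread out after one pass.
spike <- c(0, 0, 1, 0, 0)
smoothed <- box_smooth(spike, sparam = 1L)
```

Larger values of sparam give smoother curves at the cost of flattening sharp features.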

parallel

A Boolean specifying whether parallel mode (using foreach::foreach() and the doParallel package) should be activated. Defaults to FALSE.

alignment

A Boolean specifying whether to perform alignment. Defaults to TRUE.

rotation

A Boolean specifying whether to perform rotation. Defaults to FALSE.

scale

A Boolean specifying whether to scale curves to unit length. Defaults to TRUE.

omethod

A string specifying which method should be used to solve the optimization problem that provides estimated warping functions. Choices are "DP" or "RBFGS". Defaults to "DP".

max_iter

An integer value specifying the maximum number of iterations. Defaults to 50L.

thresh

A numeric value specifying a threshold on the cost function below which convergence is assumed. Defaults to 0.01.

use_verbose

A Boolean specifying whether to display information about the calculations in the console. Defaults to FALSE.

Value

An object of class fdakma which is a list containing:

References

Srivastava, A., Wu, W., Kurtek, S., Klassen, E., Marron, J. S., May 2011. Registration of functional data using Fisher-Rao metric, arXiv:1103.3817v2.

Tucker, J. D., Wu, W., Srivastava, A., Generative models for functional data using phase and amplitude separation, Computational Statistics and Data Analysis (2012), 10.1016/j.csda.2012.12.001.

Sangalli, L. M., et al. (2010). "k-mean alignment for curve clustering." Computational Statistics & Data Analysis 54(5): 1219-1233.

Examples

## Not run: 
  out <- kmeans_align(growth_vel$f, growth_vel$time, K = 2)

## End(Not run)
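
A slightly longer variant of the call above, using only arguments listed under Usage. It is a sketch, guarded so that it runs only when the package is installed; the particular argument values are arbitrary choices, and the structure of the result is inspected with str() rather than assumed.

```r
if (requireNamespace("fdasrvf", quietly = TRUE)) {
  library(fdasrvf)

  K <- 2L
  N <- ncol(growth_vel$f)           # curves are stored in columns

  set.seed(1234)                    # reproducible choice of initial centroids
  seeds <- sample.int(N, size = K)

  out <- kmeans_align(
    growth_vel$f,
    growth_vel$time,
    K = K,
    seeds = seeds,
    centroid_type = "mean",
    max_iter = 20L,
    use_verbose = TRUE
  )

  class(out)
  str(out, max.level = 1)           # list the top-level components
}
```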

[Package fdasrvf version 2.2.0 Index]