fdakmeans {fdacluster} | R Documentation |
Performs k-means clustering for functional data with amplitude and phase separation
Description
This function provides implementations of the k-means clustering algorithm for functional data, with possible joint amplitude and phase separation. A number of warping classes are implemented to achieve this separation.
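To make the distinction between amplitude and phase variation concrete, here is a minimal base-R sketch. It is not the package's implementation. It assumes an affine warping of the form h(t) = a * t + b (the parametrization used internally by fdacluster may differ), and the names `template`, `warp_affine`, `unwarp_affine`, and `curve2_fun` are hypothetical helpers introduced only for illustration.

```r
# Hypothetical template curve: a Gaussian bump centered at 0.
template <- function(t) exp(-t^2 / 0.02)

# Assumed affine warping h(t) = a * t + b: `a` dilates time, `b` shifts it.
warp_affine <- function(t, a, b) a * t + b

# Inverse of the assumed warping: h^{-1}(s) = (s - b) / a.
unwarp_affine <- function(s, a, b) (s - b) / a

grid <- seq(-1, 1, length.out = 201)

# Two curves with the same shape (amplitude) but different timing (phase).
curve1 <- template(grid)
curve2_fun <- function(t) template(warp_affine(t, a = 1.2, b = 0.3))
curve2 <- curve2_fun(grid)

# Without alignment, the pointwise gap between the two curves is large.
misaligned_gap <- max(abs(curve1 - curve2))

# Evaluating the second curve on the unwarped grid removes the phase
# difference, so only amplitude variation (none here) remains.
aligned2 <- curve2_fun(unwarp_affine(grid, a = 1.2, b = 0.3))
aligned_gap <- max(abs(curve1 - aligned2))
```

In this toy setting the warping parameters are known; in a clustering run they would have to be estimated for each curve, which is the role of the chosen warping class.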
Usage
fdakmeans(
  x,
  y = NULL,
  n_clusters = 1L,
  seeds = NULL,
  seeding_strategy = c("kmeans++", "exhaustive-kmeans++", "exhaustive", "hclust"),
  warping_class = c("affine", "dilation", "none", "shift", "srsf"),
  centroid_type = "mean",
  metric = c("l2", "pearson"),
  cluster_on_phase = FALSE,
  use_verbose = TRUE,
  warping_options = c(0.15, 0.15),
  maximum_number_of_iterations = 100L,
  number_of_threads = 1L,
  parallel_method = 0L,
  distance_relative_tolerance = 0.001,
  use_fence = FALSE,
  check_total_dissimilarity = TRUE,
  compute_overall_center = FALSE,
  add_silhouettes = TRUE
)
Arguments
x |
A numeric vector of length |
y |
Either a numeric matrix of shape |
n_clusters |
An integer value specifying the number of clusters.
Defaults to |
seeds |
An integer value or vector specifying the indices of the initial
centroids. If an integer vector, it is interpreted as the indices of the
initial centroids and should therefore be of length |
seeding_strategy |
A character string specifying the strategy for
choosing the initial centroids in case the argument |
warping_class |
A string specifying the warping class. Choices are
|
centroid_type |
A string specifying the type of centroid to compute.
Choices are |
metric |
A string specifying the metric used to compare curves. Choices
are |
cluster_on_phase |
A boolean specifying whether clustering should be
based on phase variation or amplitude variation. Defaults to |
use_verbose |
A boolean specifying whether the algorithm should output
details of the steps to the console. Defaults to |
warping_options |
A numeric vector supplied as a helper to the chosen
|
maximum_number_of_iterations |
An integer specifying the maximum number
of iterations before the algorithm stops if no other convergence criterion
was met. Defaults to |
number_of_threads |
An integer value specifying the number of threads
used for parallelization. Defaults to |
parallel_method |
An integer value specifying the type of desired
parallelization for template computation. If |
distance_relative_tolerance |
A numeric value specifying a relative
tolerance on the distance update between two iterations. If all
observations have not sufficiently improved in that sense, the algorithm
stops. Defaults to |
use_fence |
A boolean specifying whether the fence algorithm should be
used to robustify the algorithm against outliers. Defaults to |
check_total_dissimilarity |
A boolean specifying whether an additional
stopping criterion based on improvement of the total dissimilarity should
be used. Defaults to |
compute_overall_center |
A boolean specifying whether the overall center
should also be computed. Defaults to |
add_silhouettes |
A boolean specifying whether silhouette values should
be computed for each observation for internal validation of the clustering
structure. Defaults to |
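As an illustration of what silhouette values measure, the following base-R sketch computes them from a pairwise distance matrix using the standard silhouette definition. Whether fdacluster computes its silhouettes in exactly this way is not stated here, and `silhouette_sketch`, `pts`, and `labels` are hypothetical names used only for this example.

```r
# Standard silhouette definition, assuming at least two clusters:
#   a(i) = mean distance from i to the other members of its own cluster
#   b(i) = smallest mean distance from i to the members of another cluster
#   s(i) = (b(i) - a(i)) / max(a(i), b(i)), with s(i) = 0 for singletons
silhouette_sketch <- function(d, memberships) {
  d <- as.matrix(d)
  clusters <- unique(memberships)
  stopifnot(length(clusters) >= 2L, nrow(d) == length(memberships))
  vapply(seq_len(nrow(d)), function(i) {
    own <- memberships == memberships[i]
    if (sum(own) == 1L) return(0)
    # d[i, i] is 0, so dividing by (cluster size - 1) averages over the others.
    a <- sum(d[i, own]) / (sum(own) - 1)
    others <- setdiff(clusters, memberships[i])
    b <- min(vapply(
      others,
      function(k) mean(d[i, memberships == k]),
      numeric(1)
    ))
    (b - a) / max(a, b)
  }, numeric(1))
}

# Toy data: two well-separated groups of points on the real line.
pts <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
labels <- c(1L, 1L, 1L, 2L, 2L, 2L)
sil <- silhouette_sketch(dist(pts), labels)
```

Values close to 1 indicate observations that sit well inside their cluster, while values near 0 or below suggest observations that lie between clusters or may be misassigned.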
Value
An object of class caps.
Examples
#----------------------------------
# Extracts 15 out of the 30 simulated curves in `simulated30_sub` data set
idx <- c(1:5, 11:15, 21:25)
x <- simulated30_sub$x[idx, ]
y <- simulated30_sub$y[idx, , ]
#----------------------------------
# Runs a k-means clustering with affine alignment, searching for 2 clusters
out <- fdakmeans(
  x = x,
  y = y,
  n_clusters = 2,
  warping_class = "affine"
)
#----------------------------------
# Then visualize the results
# Either with ggplot2 via ggplot2::autoplot(out)
# or using graphics::plot()
# You can visualize the original and aligned curves with:
plot(out, type = "amplitude")
# Or the estimated warping functions with:
plot(out, type = "phase")