R: HTK-Means Clustering

HTKmeans {clusterHD}

R Documentation

HTK-Means Clustering

Description

Perform HTK-means clustering (Raymaekers and Zamar, 2022) on a data matrix.

Usage

HTKmeans(X, k, lambdas = NULL,
         standardize = TRUE,
         iter.max = 100, nstart = 100,
         nlambdas = 50,
         lambda_max = 1,
         verbose = FALSE)

Arguments

`X`	a matrix containing the data.
`k`	the number of clusters.
`lambdas`	a vector of values for the regularization parameter `lambda`. Defaults to `NULL`, which generates a sequence of values automatically.
`standardize`	logical flag for standardization to mean 0 and variance 1 of the data in `X`. This is recommended, unless the variance of the variables is known to quantify relevant information.
`iter.max`	the maximum number of iterations allowed.
`nstart`	number of starts used when k-means is applied to generate the starting values for HTK-means. See below for more info.
`nlambdas`	Number of lambda values to generate automatically.
`lambda_max`	Maximum value for the regularization parameter `lambda`. If `standardize = TRUE`, the default of 1 works well.
`verbose`	Whether or not to print progress. Defaults to `FALSE`.
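
As an illustration of what `standardize = TRUE` means, the sketch below rescales the columns of a matrix to mean 0 and variance 1 using base R's `scale()`. Whether `HTKmeans` uses `scale()` internally is an assumption; the snippet only shows the transformation described above.

```r
# Standardize each column to mean 0 and variance 1 with base R's scale().
X <- as.matrix(iris[, 1:4])
X_std <- scale(X, center = TRUE, scale = TRUE)

# Column means are numerically zero and column variances are one.
col_means <- colMeans(X_std)
col_vars  <- apply(X_std, 2, var)
print(round(col_means, 10))
print(col_vars)
```

If the variances of the variables carry meaningful information, skip this step by passing `standardize = FALSE`.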

Details

The algorithm starts by generating a number of sparse starting values. This is done using k-means on subsets of variables. See Raymaekers and Zamar (2022) for details.
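
The idea of a sparse starting value can be sketched as follows. This is not the package's implementation: the helper `sparse_start`, the choice of single-variable subsets, and the convention of giving inactive variables the overall column mean in every cluster are all assumptions made for illustration. The sketch runs `stats::kmeans` on a subset of variables and embeds the resulting centers into a full-dimensional center matrix.

```r
# Hypothetical helper (not part of clusterHD): build sparse starting
# centers by running k-means on a subset of the variables.
sparse_start <- function(X, k, active, nstart = 10) {
  X <- as.matrix(X)
  # Cluster using only the chosen (active) variables.
  km <- stats::kmeans(X[, active, drop = FALSE], centers = k, nstart = nstart)
  # Assumption: inactive variables get the overall column mean in every
  # cluster, so they carry no information about cluster membership.
  centers <- matrix(colMeans(X), nrow = k, ncol = ncol(X), byrow = TRUE,
                    dimnames = list(NULL, colnames(X)))
  centers[, active] <- km$centers
  list(centers = centers, cluster = km$cluster, active = active)
}

set.seed(1)
X_std <- scale(as.matrix(iris[, 1:4]))
# One candidate start per single variable (an illustrative choice of subsets).
starts <- lapply(seq_len(ncol(X_std)),
                 function(j) sparse_start(X_std, k = 3, active = j))
dim(starts[[1]]$centers)
```

Each candidate start has centers that differ across clusters only in its active variables. See Raymaekers and Zamar (2022) for how the subsets are actually chosen.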

Value

A list with components:

`HTKmeans.out`	A list with length equal to the number of lambda values (supplied in `lambdas` or generated automatically). Each element of this list is in turn a list containing:

`centers`	A matrix of cluster centres.
`cluster`	A vector of integers (from `1:k`) indicating the cluster to which each point is allocated.
`itnb`	The number of iterations executed until convergence.
`converged`	Whether the algorithm stopped by converging or by reaching the maximum number of iterations.

`inputargs`	The input arguments to the function.
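
To make the nested layout concrete, here is a minimal sketch that tabulates one row per lambda value from an object with the components documented above. The helper `summarize_fits` is hypothetical, and the object `mock` is a hand-built stand-in based on `stats::kmeans`, not real `HTKmeans` output. The contents of its `inputargs` are placeholders.

```r
# Hypothetical helper (not part of clusterHD): one summary row per lambda.
summarize_fits <- function(fit, tol = 1e-8) {
  rows <- lapply(fit$HTKmeans.out, function(res) {
    # A variable counts as active when its centers differ across clusters.
    active <- apply(res$centers, 2, function(v) diff(range(v)) > tol)
    data.frame(n_active = sum(active),
               itnb = res$itnb,
               converged = res$converged)
  })
  do.call(rbind, rows)
}

# Stand-in object with the documented layout, built from stats::kmeans.
# This is NOT real HTKmeans output; inputargs holds placeholder values.
set.seed(1)
X_std <- scale(as.matrix(iris[, 1:4]))
km <- stats::kmeans(X_std, centers = 3, nstart = 10)
mock <- list(
  HTKmeans.out = list(list(centers = km$centers, cluster = km$cluster,
                           itnb = km$iter, converged = km$ifault == 0)),
  inputargs = list(k = 3, lambdas = 0.8)
)
fit_summary <- summarize_fits(mock)
print(fit_summary)
```

With a real fit, the same helper would be applied to the value returned by `HTKmeans`, giving one row for each lambda value.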

Author(s)

J. Raymaekers and R.H. Zamar

References

Raymaekers, Jakob, and Ruben H. Zamar. "Regularized K-means through hard-thresholding." arXiv preprint arXiv:2010.00950 (2020).

Examples

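library(clusterHD)
X <- iris[, 1:4]
# HTKmeans() returns a list; the per-lambda fits are in its HTKmeans.out component.
HTKmeans.out <- HTKmeans(X, k = 3, lambdas = 0.8)
# Only one lambda value was supplied, so the fit of interest is the first element.
HTKmeans.out$HTKmeans.out[[1]]$centers
pairs(X, col = HTKmeans.out$HTKmeans.out[[1]]$cluster)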

[Package clusterHD version 1.0.2 Index]