pdkMeans {pdSpecEst}    R Documentation
K-means clustering for HPD matrices
Description
pdkMeans performs (fuzzy) k-means clustering for collections of HPD matrices, such as covariance or spectral density matrices, based on a number of different metrics in the space of HPD matrices.
Usage
pdkMeans(X, K, metric = "Riemannian", m = 1, eps = 1e-05,
max_iter = 100, centroids)
Arguments
X
a (d,d,S)- or (d,d,n,S)-dimensional array of HPD matrices, corresponding respectively to a collection of S individual (d,d)-dimensional HPD matrices, or S sequences of n (d,d)-dimensional HPD matrices.

K
the number of clusters, a positive integer larger than 1.

metric
the metric that the space of HPD matrices is equipped with. The default choice is "Riemannian", but this can also be one of: "logEuclidean", "Cholesky", "rootEuclidean" or "Euclidean". See the Details section below.

m
a fuzziness parameter larger or equal to 1. If m = 1, the subjects are assigned to the clusters in a non-probabilistic (hard) fashion; if m > 1, the cluster assignments are fuzzy (probabilistic).

eps
an optional tolerance parameter determining the stopping criterion. The k-means algorithm terminates if the intrinsic distance between cluster centers is smaller than eps, defaults to eps = 1e-05.

max_iter
an optional parameter tuning the maximum number of iterations in the k-means algorithm, defaults to max_iter = 100.

centroids
an optional (d,d,K)- or (d,d,n,K)-dimensional array, depending on the format of the input array X, specifying the initial cluster centroids. If not specified, the initial centroids are obtained by random sampling without replacement from X.
Details
The input array X corresponds to a collection of (d,d)-dimensional HPD matrices for S different subjects. If the fuzziness parameter satisfies m > 1, the S subjects are assigned to the K different clusters in a probabilistic fashion according to a fuzzy k-means algorithm as detailed in classical texts, such as (Bezdek 1981). If m = 1, the S subjects are assigned to the K clusters in a non-probabilistic fashion according to a standard (hard) k-means algorithm. If not specified by the user, the K cluster centers are initialized by random sampling without replacement from the input array of HPD matrices X.
The distance measure in the (fuzzy) k-means algorithm is induced by the metric on the space of HPD matrices specified by the user. By default, the space of HPD matrices is equipped with (i) the affine-invariant Riemannian metric (metric = 'Riemannian') as detailed in, e.g., (Bhatia 2009)[Chapter 6] or (Pennec et al. 2006). Instead, this can also be one of: (ii) the log-Euclidean metric (metric = 'logEuclidean'), the Euclidean inner product between matrix logarithms; (iii) the Cholesky metric (metric = 'Cholesky'), the Euclidean inner product between Cholesky decompositions; (iv) the Euclidean metric (metric = 'Euclidean'); or (v) the root-Euclidean metric (metric = 'rootEuclidean'). The default choice of metric (affine-invariant Riemannian) satisfies several useful properties not shared by the other metrics, see, e.g., pdSpecEst for more details. Note that this comes at the cost of increased computation time in comparison to one of the other metrics.
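As a minimal sketch (using simulated input data, not part of the package's own examples below), hard clustering under the log-Euclidean metric can be obtained by setting m = 1 and metric = "logEuclidean":

## Simulate a (d,d,S)-dimensional array of random HPD matrices (d = 2, S = 10)
d <- 2; S <- 10
X0 <- replicate(S, {
  z <- matrix(complex(real = rnorm(d^2), imaginary = rnorm(d^2)), nrow = d)
  t(Conj(z)) %*% z  ## t(Conj(z)) %*% z is Hermitian positive definite
})
## Hard k-means clustering (m = 1) under the log-Euclidean metric
fit <- pdkMeans(X0, K = 2, metric = "logEuclidean", m = 1)
fit$cl.assignments  ## binary (S,K) cluster membership matrix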
Value
Returns a list with two components:
- cl.assignments: an (S,K)-dimensional matrix, where the value at position (s,k) in the matrix corresponds to the (probabilistic or binary) cluster membership assignment of subject s with respect to cluster k.
- cl.centroids: either a (d,d,K)- or (d,d,n,K)-dimensional array, depending on the input array X, corresponding respectively to the K (d,d)- or (d,d,n)-dimensional final cluster centroids.
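Continuing the sketch in the Details section, the returned components can be inspected as follows (assuming the fitted object fit from that sketch):

dim(fit$cl.assignments)  ## (S,K) membership matrix; each row is one-hot for m = 1
dim(fit$cl.centroids)    ## (d,d,K) array of final cluster centroids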
References
Bezdek J (1981).
Pattern Recognition with Fuzzy Objective Function Algorithms.
Plenum Press, New York.
Bhatia R (2009).
Positive Definite Matrices.
Princeton University Press, New Jersey.
Pennec X, Fillard P, Ayache N (2006).
“A Riemannian framework for tensor computing.”
International Journal of Computer Vision, 66(1), 41–66.
See Also
pdDist, pdSpecClust1D, pdSpecClust2D
Examples
## Generate 20 random HPD matrices in 2 groups
m <- function(rescale){
  ## random complex (3,3) matrix; t(Conj(x)) %*% x is Hermitian positive definite
  x <- matrix(complex(real = rescale * rnorm(9), imaginary = rescale * rnorm(9)), nrow = 3)
  t(Conj(x)) %*% x
}
X <- array(c(replicate(10, m(0.25)), replicate(10, m(1))), dim = c(3, 3, 20))
## Compute fuzzy k-means cluster assignments
cl <- pdkMeans(X, K = 2, m = 2)$cl.assignments
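## A possible follow-up (not part of the original example): convert the fuzzy
## memberships to hard cluster labels per subject
cl.labels <- apply(cl, 1, which.max)
table(cl.labels)  ## cluster sizes across the 20 generated matrices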