R: Functional K-Means Clustering by Abraham et al. (2003)

funkmeans03A {T4cluster}

R Documentation

Functional K-Means Clustering by Abraham et al. (2003)

Description

Given N curves \gamma_1 (t), \gamma_2 (t), \ldots, \gamma_N (t) : I \rightarrow \mathbf{R}, perform k-means clustering on the coefficients from the functional data expanded by B-spline basis. Note that in the original paper, authors used B-splines as the choice of basis due to nice properties. However, we allow other types of basis as well for convenience.

Usage

funkmeans03A(fdobj, k = 2, ...)

Arguments

fdobj

a 'fd' functional data object of N curves by the fda package.

k

the number of clusters (default: 2).

...

extra parameters including

maxiter: the maximum number of iterations (default: 10).
nstart: the number of random initializations (default: 5).

Value

a named list of S3 class T4cluster containing

cluster: a length-N vector of class labels (from 1:k).
mean: a 'fd' object of k mean curves.
algorithm: name of the algorithm.

References

Abraham C, Cornillon PA, Matzner-Lober E, Molinari N (2003). “Unsupervised Curve Clustering Using B-Splines.” Scandinavian Journal of Statistics, 30(3), 581–595. ISSN 0303-6898, 1467-9469.

Examples

# -------------------------------------------------------------
#                     two types of curves
#
# type 1 : sin(x) + perturbation; 20 OF THESE ON [0, 2*PI]
# type 2 : cos(x) + perturbation; 20 OF THESE ON [0, 2*PI]
# type 3 : sin(x) + cos(0.5x)   ; 20 OF THESE ON [0, 2*PI]
# -------------------------------------------------------------
## PREPARE : USE 'fda' PACKAGE
#  Generate Raw Data
datx = seq(from=0, to=2*pi, length.out=100)
daty = array(0,c(100, 60))
for (i in 1:20){
  daty[,i]    = sin(datx) + rnorm(100, sd=0.5)
  daty[,i+20] = cos(datx) + rnorm(100, sd=0.5)
  daty[,i+40] = sin(datx) + cos(0.5*datx) + rnorm(100, sd=0.5)
}
#  Wrap as 'fd' object
mybasis <- fda::create.bspline.basis(c(0,2*pi), nbasis=10)
myfdobj <- fda::smooth.basis(datx, daty, mybasis)$fd

## RUN THE ALGORITHM WITH K=2,3,4
fk2 = funkmeans03A(myfdobj, k=2)
fk3 = funkmeans03A(myfdobj, k=3)
fk4 = funkmeans03A(myfdobj, k=4)

## FUNCTIONAL PCA FOR VISUALIZATION
embed = fda::pca.fd(myfdobj, nharm=2)$score

## VISUALIZE
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(embed, col=fk2$cluster, pch=19, main="K=2")
plot(embed, col=fk3$cluster, pch=19, main="K=3")
plot(embed, col=fk4$cluster, pch=19, main="K=4")
par(opar)

[Package T4cluster version 0.1.2 Index]