qtclust {flexclust} | R Documentation |
Stochastic QT Clustering
Description
Perform stochastic QT clustering on a data matrix.
Usage
qtclust(x, radius, family = kccaFamily("kmeans"), control = NULL,
save.data=FALSE, kcca=FALSE)
Arguments
x |
A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns). |
radius |
Maximum radius of clusters. |
family |
Object of class |
control |
An object of class |
.
save.data |
Save a copy of |
kcca |
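Logical. If TRUE, run kcca() after the QT clustering, initialized on the QT cluster solution (see Value). |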
Details
This function implements a variation of the QT clustering algorithm by
Heyer et al. (1999); see Scharl and Leisch (2006). The main difference
is that each iteration considers not all possible cluster start points
but only a random sample of size control@ntry. In addition, a point is
considered as an initial center only if at least one other point lies
within distance radius of it. In most cases the resulting solutions are
almost the same as those of the original algorithm, at a considerable
speed increase; in some cases even better solutions are obtained. If
control@ntry is set to the size of the data set, an algorithm similar
to the original one proposed by Heyer et al. (1999) is obtained.
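The idea described above can be illustrated with a small base R sketch. It is a simplified illustration and not the code used by flexclust: the function name stochastic_qt_sketch is hypothetical, distances are assumed to be Euclidean, and each cluster is assumed to be simply the set of unclustered points within radius of the chosen start point (the start point itself acts as the center and is not recomputed).

```r
## Simplified sketch of the stochastic QT idea (NOT the flexclust implementation).
## Assumptions: Euclidean distance; a cluster is all unclustered points within
## `radius` of the selected start point.
stochastic_qt_sketch <- function(x, radius, ntry = 5) {
  D <- as.matrix(dist(x))               # all pairwise Euclidean distances
  cluster <- rep(NA_integer_, nrow(D))  # NA marks points not (yet) clustered
  k <- 0L
  repeat {
    free <- which(is.na(cluster))
    if (length(free) < 2L) break
    within <- D[free, free, drop = FALSE] <= radius
    ## eligible start points: at least one OTHER free point within `radius`
    ## (each row also counts the point itself, hence "> 1")
    eligible <- which(rowSums(within) > 1L)
    if (length(eligible) == 0L) break
    ## stochastic step: try only a random sample of eligible start points
    cand <- eligible[sample.int(length(eligible), min(ntry, length(eligible)))]
    ## keep the candidate that captures the most points
    sizes <- rowSums(within[cand, , drop = FALSE])
    best <- cand[which.max(sizes)]
    k <- k + 1L
    cluster[free[within[best, ]]] <- k
  }
  cluster
}

set.seed(1)
x <- matrix(10 * runif(400), ncol = 2)
cl <- stochastic_qt_sketch(x, radius = 1, ntry = 5)
## cluster sizes; points left unclustered are counted under NA
table(cl, useNA = "ifany")
```

Setting ntry to nrow(x) in this sketch makes every eligible point a candidate in each iteration, which mirrors the remark about control@ntry above.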
Value
By default, qtclust returns an object of class "kccasimple". If
argument kcca is TRUE, kcca() is run afterwards, initialized on the QT
cluster solution. Data points not clustered by the QT cluster algorithm
are omitted from the kcca() iterations but filled back into the return
object. All plot methods defined for objects of class "kcca" can be
used.
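The following short sketch shows how the control and kcca arguments might be used together. It assumes that a plain list passed as control is coerced to the control class, as with other flexclust functions, and that predict() without new data returns the stored cluster memberships, as in the Examples of this page. The data and the value of radius are arbitrary.

```r
library(flexclust)

set.seed(42)
x <- matrix(10 * runif(400), ncol = 2)

## default control: only a small random sample of start points per iteration
cl_fast <- qtclust(x, radius = 2)

## ntry equal to the number of observations: close to the original QT
## algorithm, at the price of a longer runtime
## (assumption: a list is accepted for `control`)
cl_full <- qtclust(x, radius = 2, control = list(ntry = nrow(x)))

## run kcca() afterwards, initialized on the QT cluster solution
cl_kcca <- qtclust(x, radius = 2, kcca = TRUE)

## cluster sizes; points left unclustered are counted under NA
table(predict(cl_fast), useNA = "ifany")
table(predict(cl_kcca), useNA = "ifany")
```

Because the start points are sampled at random, calling set.seed() beforehand is needed to make the cluster solutions reproducible.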
Author(s)
Friedrich Leisch
References
Heyer, L. J., Kruglyak, S., Yooseph, S. (1999). Exploring expression data: Identification and analysis of coexpressed genes. Genome Research 9, 1106–1115.
Scharl, T., Leisch, F. (2006). The stochastic QT-clust algorithm: evaluation of stability and variance on time-course microarray data. In A. Rizzi and M. Vichi (eds.), Compstat 2006 – Proceedings in Computational Statistics, 1015–1022. Physica Verlag, Heidelberg, Germany.
Examples
x <- matrix(10*runif(1000), ncol=2)
## maximum distance of a point to its cluster center is 3
cl1 <- qtclust(x, radius=3)
## maximum distance of a point to its cluster center is 1
## -> more clusters, longer runtime
cl2 <- qtclust(x, radius=1)
opar <- par(c("mfrow","mar"))
par(mfrow=c(2,1), mar=c(2.1,2.1,1,1))
plot(x, col=predict(cl1), xlab="", ylab="")
plot(x, col=predict(cl2), xlab="", ylab="")
par(opar)