protocut {protoclust} | R Documentation |
Cut a Minimax Linkage Tree To Get a Clustering
Description
Cuts a minimax linkage tree to get one of n - 1 clusterings. Works like cutree,
except that it also returns the prototypes of the resulting clustering.
Usage
protocut(hc, k = NULL, h = NULL)
Arguments
hc | an object returned by protoclust |
k | the number of clusters desired |
h | the height at which to cut the tree |
Details
Given a minimax linkage hierarchical clustering, this function cuts the tree
at a given height or so that a specified number of clusters is created. It
returns both the indices of the prototypes and their locations in the tree.
The latter information is useful for plotting a dendrogram with prototypes
(see plotwithprototypes). As with cutree, if both k and h are given, h is
ignored. Unlike cutree, in the current version k and h cannot be vectors.
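The k/h behaviour inherited from cutree can be checked with base R alone. The following is only a sketch on an ordinary hclust tree: complete linkage is assumed here as a stand-in so that no extra package is needed, and the toy data and variable names are made up for illustration. It does not call protocut itself.

```r
# Toy data; hclust() with complete linkage stands in for a minimax linkage tree
# (an assumption made for illustration only).
set.seed(1)
n <- 20
x <- matrix(rnorm(n * 2), n, 2)
tree <- hclust(dist(x), method = "complete")

k <- 4
h_k <- tree$height[n - k]  # height of the merge that leaves exactly k clusters

# Cutting by number of clusters and cutting at the matching height agree.
by_k <- cutree(tree, k = k)
by_h <- cutree(tree, h = h_k)
same_partition <- identical(by_k, by_h)

# When both are supplied, cutree() uses k and ignores h.
both <- cutree(tree, k = k, h = max(tree$height))
k_wins <- identical(both, by_k)

# cutree() accepts a vector of k; a function restricted to a single k
# can instead be called once per value.
ks <- c(2, 4, 6)
looped <- sapply(ks, function(kk) cutree(tree, k = kk))
vectorised <- cutree(tree, k = ks)
same_values <- all(looped == vectorised)
```

The same one-value-at-a-time loop is presumably how several cuts would be obtained from protocut, given that it takes only a single k or h.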
Value
A list corresponding to the clustering obtained by cutting the tree:
cl | vector of cluster memberships |
protos | vector of prototype indices corresponding to the k clusters created. |
imerge | vector describing the nodes where prototypes occur. We use the naming convention of the |
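To make cl and protos more concrete, here is a rough base-R sketch of what a prototype is under minimax linkage, following Bien and Tibshirani (2011): within a cluster, the prototype is the member whose largest distance to any other member is smallest. The helper minimax_protos is hypothetical and not part of protoclust, and a complete-linkage hclust tree is assumed as a stand-in for protoclust output, so the memberships here need not match what protocut would return.

```r
# Hypothetical helper, not part of protoclust: for each cluster in `cl`,
# return the index of the member that minimises the maximum distance
# to the members of that cluster.
minimax_protos <- function(dmat, cl) {
  sapply(sort(unique(cl)), function(g) {
    members <- which(cl == g)
    # radius[j] = largest distance from member j to any member of the cluster
    radius <- apply(dmat[members, members, drop = FALSE], 1, max)
    members[which.min(radius)]
  })
}

set.seed(1)
x <- matrix(rnorm(40), 20, 2)
d <- dist(x)
dmat <- as.matrix(d)

# Stand-in tree (assumption): complete linkage instead of minimax linkage.
cl <- cutree(hclust(d, method = "complete"), k = 3)
protos <- minimax_protos(dmat, cl)

# Prototype assigned to each point, using the same indexing as the Examples.
pr <- protos[cl]
```

Because cl takes the values 1 to k, indexing protos by cl maps every observation to the prototype of its own cluster.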
Author(s)
Jacob Bien and Rob Tibshirani
References
Bien, J., and Tibshirani, R. (2011), "Hierarchical Clustering with Prototypes via Minimax Linkage," The Journal of the American Statistical Association, 106(495), 1075-1084.
See Also
protoclust, cutree, plotwithprototypes
Examples
# generate some data:
set.seed(1)
n <- 100
p <- 2
x <- matrix(rnorm(n * p), n, p)
rownames(x) <- paste("A", 1:n, sep="")
d <- dist(x)
# perform minimax linkage clustering:
hc <- protoclust(d)
# cut the tree to yield a 10-cluster clustering:
k <- 10 # number of clusters
cut <- protocut(hc, k=k)
h <- hc$height[n - k]
# plot dendrogram (and show cut):
plotwithprototypes(hc, imerge=cut$imerge, col=2)
abline(h=h, lty=2)
# get the prototype assigned to each point:
pr <- cut$protos[cut$cl]
# find point farthest from its prototype:
dmat <- as.matrix(d)
ifar <- which.max(dmat[cbind(1:n, pr[1:n])])
# note that this distance is exactly h:
stopifnot(dmat[ifar, pr[ifar]] == h)
# since this is a 2d example, make 2d display:
plot(x, type="n")
points(x, pch=20, col="lightblue")
lines(rbind(x[ifar, ], x[pr[ifar], ]), col=3)
points(x[cut$protos, ], pch=20, col="red")
text(x[cut$protos, ], labels=hc$labels[cut$protos], pch=19)
tt <- seq(0, 2 * pi, length=100)
for (i in cut$protos) {
  lines(x[i, 1] + h * cos(tt), x[i, 2] + h * sin(tt))
}