hcluster {amap} | R Documentation |
Hierarchical Clustering
Description
Hierarchical cluster analysis.
Usage
hcluster(x, method = "euclidean", diag = FALSE, upper = FALSE,
link = "complete", members = NULL, nbproc = 2,
doubleprecision = TRUE)
Arguments
x |
A numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns), or an object of class "exprSet". |
method |
the distance measure to be used. This must be one of
|
diag |
logical value indicating whether the diagonal of the
distance matrix should be printed by |
upper |
logical value indicating whether the upper triangle of the
distance matrix should be printed by |
link |
the agglomeration method to be used. This should
be (an unambiguous abbreviation of) one of
|
members |
|
nbproc |
integer, number of subprocesses used for parallelization (Linux and Mac only). |
doubleprecision |
logical. If TRUE, double precision is used for the distance matrix computation; if FALSE, single precision is used. |
Details
This function combines the functions hclust and dist:
hcluster(x, method = "euclidean", link = "complete")
is equivalent to
hclust(dist(x, method = "euclidean"), method = "complete").
It uses half as much memory, as it does not store the distance matrix.
For more details, see the documentation of hclust and Dist.
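The equivalence described above can be checked with a short sketch. It assumes the amap package is installed (the hcluster call is skipped otherwise), and it compares only the merge heights, since the order of merges may differ when distances are tied. Setting nbproc = 1 is a choice made here to keep the run single-process, not a requirement.

```r
# Reference result: two-step approach from base R (stats package).
ref <- hclust(dist(USArrests, method = "euclidean"), method = "complete")

# One-step approach, run only if the amap package is installed.
if (requireNamespace("amap", quietly = TRUE)) {
  hc <- amap::hcluster(USArrests, method = "euclidean", link = "complete",
                       nbproc = 1)
  # Compare merge heights; all.equal() allows for floating-point tolerance.
  print(all.equal(hc$height, ref$height))
}
```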
Value
An object of class hclust which describes the tree produced by the clustering process. The object is a list with components:
merge |
an |
height |
a set of |
order |
a vector giving the permutation of the original
observations suitable for plotting, in the sense that a cluster
plot using this ordering and matrix |
labels |
labels for each of the objects being clustered. |
call |
the call which produced the result. |
method |
the cluster method that has been used. |
dist.method |
the distance that has been used to create |
There is a print and a plot method for hclust objects.
The plclust() function is basically the same as the plot method, plot.hclust, and is kept primarily for backward compatibility with S-PLUS. Its extra arguments are not yet implemented.
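As an illustration of the components listed above, the following sketch inspects a fitted tree. It uses hcluster when the amap package is installed and otherwise falls back to hclust(dist(x)), which returns an object of the same class. The choice of four clusters in cutree is arbitrary.

```r
# Use hcluster when amap is installed; otherwise fall back to hclust(dist()).
hc <- if (requireNamespace("amap", quietly = TRUE)) {
  amap::hcluster(USArrests, method = "euclidean", link = "complete", nbproc = 1)
} else {
  hclust(dist(USArrests), method = "complete")
}

n <- nrow(USArrests)
dim(hc$merge)      # one row per merge step, two columns
length(hc$height)  # one height per merge step
head(hc$order)     # start of the plotting permutation of the observations
head(hc$labels)    # labels of the clustered objects

# Cut the tree into 4 groups (arbitrary choice) and count group sizes.
memb <- cutree(hc, k = 4)
table(memb)
```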
Note
Multi-threading (parallelisation) is disabled on Windows.
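One way to write portable calls is to pick nbproc from the platform before calling hcluster. This is a minimal sketch, not part of the package: the variable names and the cap of two subprocesses are assumptions made for illustration, and the hcluster call runs only if amap is installed.

```r
# Assumption: use one subprocess on Windows, otherwise at most two.
on_windows <- .Platform$OS.type == "windows"
n_cores <- parallel::detectCores()
if (is.na(n_cores)) n_cores <- 1L   # detectCores() may return NA
nb <- if (on_windows) 1L else min(2L, n_cores)

if (requireNamespace("amap", quietly = TRUE)) {
  hc <- amap::hcluster(USArrests, link = "average", nbproc = nb)
}
```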
Author(s)
The hcluster function is based on C code adapted from the CRAN Fortran routine by Antoine Lucas.
References
Antoine Lucas and Sylvain Jasson, Using amap and ctc Packages for Huge Clustering, R News, 2006, vol. 6, issue 5, pages 58-60.
See Also
Examples
library(amap)
data(USArrests)
hc <- hcluster(USArrests, link = "ave")
plot(hc)
plot(hc, hang = -1)
## Do the same with centroid clustering and squared Euclidean distance,
## cut the tree into ten clusters and reconstruct the upper part of the
## tree from the cluster centers.
hc <- hclust(dist(USArrests)^2, "cen")
memb <- cutree(hc, k = 10)
cent <- NULL
for (k in 1:10) {
  cent <- rbind(cent, colMeans(USArrests[memb == k, , drop = FALSE]))
}
hc1 <- hclust(dist(cent)^2, method = "cen", members = table(memb))
opar <- par(mfrow = c(1, 2))
plot(hc, labels = FALSE, hang = -1, main = "Original Tree")
plot(hc1, labels = FALSE, hang = -1, main = "Re-start from 10 clusters")
par(opar)
## other combinations are possible
hc <- hcluster(USArrests, method = "euc", link = "ward", nbproc = 1,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "max", link = "single", nbproc = 2,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "man", link = "complete", nbproc = 1,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "can", link = "average", nbproc = 2,
               doubleprecision = TRUE)
hc <- hcluster(USArrests, method = "bin", link = "mcquitty", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "pea", link = "median", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "abspea", link = "median", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "cor", link = "centroid", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "abscor", link = "centroid", nbproc = 1,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "spe", link = "complete", nbproc = 2,
               doubleprecision = FALSE)
hc <- hcluster(USArrests, method = "ken", link = "complete", nbproc = 2,
               doubleprecision = FALSE)