clust {MorphoTools2} | R Documentation |
Hierarchical Clustering
Description
Hierarchical cluster analysis of objects.
Usage
clust(object, distMethod = "Euclidean", clustMethod = "UPGMA", binaryChs = NULL,
nominalChs = NULL, ordinalChs = NULL)
Arguments
object |
an object of class |
distMethod |
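the distance measure to be used. This must be one of: "Euclidean" (default), "Manhattan", "Minkowski", "Jaccard", "simpleMatching", or "Gower". See details. |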
clustMethod |
the agglomeration method to be used: |
binaryChs , nominalChs , ordinalChs |
names of binary, categorical nominal (multistate), and categorical ordinal characters, respectively. Needed only for Gower's dissimilarity coefficient; see details. |
Details
This function performs agglomerative hierarchical clustering. Typically, populations are used as OTUs (operational taxonomic units). Characters are standardised to a zero mean and a unit standard deviation.
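The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the internal code of clust: the toy values and the object names (x, z, d, tree) are made up, and stats::scale, stats::dist and stats::hclust stand in for whatever the package does internally. In hclust, UPGMA corresponds to method = "average".

```r
# Toy data (made-up values): 4 OTUs (rows) by 2 quantitative characters
x <- matrix(c(10, 11, 22, 23,
              8.0, 7.1, 1.2, 2.3),
            ncol = 2,
            dimnames = list(c("Pop1", "Pop2", "Pop3", "Pop4"),
                            c("stemHeight", "leafLength")))

# Standardise each character to zero mean and unit standard deviation
z <- scale(x, center = TRUE, scale = TRUE)

# Euclidean distances between OTUs, then average linkage (UPGMA)
d <- dist(z, method = "euclidean")
tree <- hclust(d, method = "average")
```

Standardisation keeps a character measured on a large scale (here stemHeight) from dominating the distances.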
Various measures of distance between the observations (rows) are applicable: (1) coefficients of distance for quantitative and binary characters: "Euclidean", "Manhattan", "Minkowski"; (2) similarity coefficients for binary characters: "Jaccard" and simple matching ("simpleMatching"); (3) coefficient for mixed data: "Gower".
Note that methods of clustering and distance measurement other than the defaults are rarely used in morphometric analyses.
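To make the difference between the coefficients concrete, the following base R sketch computes them on made-up toy data. It is an illustration only: the objects q and b are hypothetical, and the binary coefficients are shown as plain dissimilarities (1 minus similarity), which is an assumption, because this page does not state how clust converts similarities to distances.

```r
# Quantitative toy data (made-up values): 3 OTUs by 2 characters
q <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))
euclid    <- dist(q, method = "euclidean")         # a-b: 5
manhattan <- dist(q, method = "manhattan")         # a-b: 7
minkowski <- dist(q, method = "minkowski", p = 3)  # a-b: 91^(1/3)

# Binary toy data (made-up values): 3 OTUs by 4 characters
b <- rbind(a = c(1, 1, 0, 0), b = c(1, 0, 0, 0), c = c(0, 0, 1, 0))

# Jaccard dissimilarity; stats::dist calls this method "binary".
# Shared absences (0-0) are ignored.
jaccard <- dist(b, method = "binary")              # a-b: 0.5

# Simple matching dissimilarity: share of characters on which two OTUs
# differ; shared absences count as matches. For 0/1 data this is the
# Manhattan distance divided by the number of characters.
simple_matching <- dist(b, method = "manhattan") / ncol(b)  # a-b: 0.25
```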
Gower's dissimilarity coefficient can handle different types of variables. Characters have to be divided into four categories: (1) quantitative characters, (2) categorical ordinal characters, (3) categorical nominal (multistate) characters, and (4) binary characters. All characters are treated as quantitative unless otherwise specified; other types have to be marked explicitly. To mark characters as ordinal, nominal, or binary, list their names in the ordinalChs, nominalChs, and binaryChs arguments, respectively.
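How the four character types can enter a single coefficient is sketched below. The helper gower_sketch is hypothetical and is not part of MorphoTools2; it follows one common convention (range-scaled absolute differences for quantitative characters, ranks treated as quantitative for ordinal characters, simple mismatch for nominal and binary characters, equal weights). The computation inside clust may differ, for example in the treatment of ordinal characters or of shared absences in binary characters.

```r
# Hypothetical helper, for illustration only (not a MorphoTools2 function)
gower_sketch <- function(df, binaryChs = NULL, nominalChs = NULL,
                         ordinalChs = NULL) {
  n <- nrow(df)
  total <- matrix(0, n, n, dimnames = list(rownames(df), rownames(df)))
  for (ch in names(df)) {
    x <- df[[ch]]
    if (ch %in% c(binaryChs, nominalChs)) {
      # Categorical: 0 if the two states match, 1 otherwise
      part <- outer(x, x, FUN = "!=") * 1
    } else {
      # Ordinal: replace states by ranks; quantitative: keep the values
      if (ch %in% ordinalChs) x <- rank(x)
      rng <- diff(range(x))
      # Absolute difference scaled by the range; constant characters add 0
      part <- if (rng > 0) abs(outer(x, x, FUN = "-")) / rng else matrix(0, n, n)
    }
    total <- total + part
  }
  # Average over characters (equal weights)
  as.dist(total / ncol(df))
}

# Toy data taken from the Examples section of this page
chars <- data.frame(
  stemBranching = c(1, 1, 1, 0, 0, 0),
  petalColour   = c(1, 1, 2, 3, 3, 3),
  leaves        = c(1, 1, 1, 2, 2, 3),
  taste         = c(2, 2, 2, 3, 1, 1),
  stemHeight    = c(10, 11, 14, 22, 23, 21),
  leafLength    = c(8, 7.1, 9.4, 1.2, 2.3, 2.1),
  row.names     = paste0("id", 1:6))

d <- gower_sketch(chars,
                  binaryChs  = "stemBranching",
                  nominalChs = c("petalColour", "leaves"),
                  ordinalChs = "taste")
```

Each character contributes a value between 0 and 1, so the averaged coefficient also lies between 0 and 1.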
Value
An object of class 'hclust'. It encodes a stepwise dendrogram.
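What "stepwise" means can be seen by inspecting an 'hclust' object directly. The sketch below builds one with stats::hclust from a made-up distance matrix (the matrix m and the OTU labels are hypothetical); the components shown are those documented for the 'hclust' class in base R.

```r
# Toy distance matrix for 4 OTUs (made-up values)
m <- matrix(c(0, 1, 4, 5,
              1, 0, 3, 6,
              4, 3, 0, 2,
              5, 6, 2, 0),
            nrow = 4,
            dimnames = list(c("A", "B", "C", "D"), c("A", "B", "C", "D")))

tree <- hclust(as.dist(m), method = "average")  # average linkage = UPGMA

tree$merge    # one row per step; negative = single OTU, positive = earlier step
tree$height   # distance at which each merge happens: 1, 2, 4.5
tree$labels[tree$order]  # OTUs in the order used for plotting

# Cut the dendrogram into two groups
groups <- cutree(tree, k = 2)
```

Here A and B join first (distance 1), then C and D (distance 2), and the two pairs join last at the average of the four between-pair distances, (4 + 5 + 3 + 6) / 4 = 4.5.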
Examples
data(centaurea)
clustering.UPGMA = clust(centaurea)
plot(clustering.UPGMA, cex = 0.6, frame.plot = TRUE, hang = -1,
     main = "", sub = "", xlab = "", ylab = "distance")

# using Gower's method
data = list(
    ID = as.factor(c("id1", "id2", "id3", "id4", "id5", "id6")),
    Population = as.factor(c("Pop1", "Pop1", "Pop2", "Pop2", "Pop3", "Pop3")),
    Taxon = as.factor(c("TaxA", "TaxA", "TaxA", "TaxB", "TaxB", "TaxB")),
    data = data.frame(
        stemBranching = c(1, 1, 1, 0, 0, 0),  # binaryChs
        petalColour = c(1, 1, 2, 3, 3, 3),  # nominalChs; 1=white, 2=red, 3=blue
        leaves = c(1, 1, 1, 2, 2, 3),  # nominalChs; 1=simple, 2=palmately compound, 3=pinnately compound
        taste = c(2, 2, 2, 3, 1, 1),  # ordinalChs; 1=hot, 2=hotter, 3=hottest
        stemHeight = c(10, 11, 14, 22, 23, 21),  # quantitative
        leafLength = c(8, 7.1, 9.4, 1.2, 2.3, 2.1))  # quantitative
)
attr(data, "class") = "morphodata"
clustering.GOWER = clust(data, distMethod = "Gower", clustMethod = "UPGMA",
                         binaryChs = c("stemBranching"),
                         nominalChs = c("petalColour", "leaves"),
                         ordinalChs = c("taste"))
plot(clustering.GOWER, cex = 0.6, frame.plot = TRUE, hang = -1,
     main = "", sub = "", xlab = "", ylab = "distance")