R: Cluster analysis for compositional data

clustCoDa {robCompositions}

R Documentation

Cluster analysis for compositional data

Description

Clustering in orthonormal coordinates or by using the Aitchison distance

Usage

clustCoDa(
  x,
  k = NULL,
  method = "Mclust",
  scale = "robust",
  transformation = "pivotCoord",
  distMethod = NULL,
  iter.max = 100,
  vals = TRUE,
  alt = NULL,
  bic = NULL,
  verbose = TRUE
)

## S3 method for class 'clustCoDa'
plot(
  x,
  y,
  ...,
  normalized = FALSE,
  which.plot = "clusterMeans",
  measure = "silwidths"
)

Arguments

`x`	compositional data represented as a data.frame
`k`	number of clusters
`method`	clustering method. One of Mclust, cmeans, kmeansHartigan, cmeansUfcl, pam, clara, fanny, ward.D2, single, hclustComplete, average, mcquitty, median, centroid
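`scale`	whether and how the orthonormal coordinates should be normalized; the default is "robust", and "none" disables normalization.
`transformation`	coordinate representation to use; the default, "pivotCoord", gives isometric logratio coordinates. Can only be used when distMethod is not Aitchison.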
`distMethod`	Distance measure to be used. If “Aitchison”, then transformation should be “identity”.
`iter.max`	the maximum number of iterations allowed; only used if a kmeans method is chosen
`vals`	if cluster validity measures should be calculated
`alt`	a known partitioning can be provided (for special cluster validity measures)
`bic`	if TRUE then the BIC criterion is evaluated for each single cluster as a validity measure
`verbose`	if TRUE additional print output is provided
`y`	the y coordinates of points in the plot, optional if x is an appropriate structure.
`...`	additional parameters for print method passed through
`normalized`	if TRUE, the results are normalized before plotting. Normalization is done by a z-transformation applied to each variable.
`which.plot`	currently the only plot. Plot of cluster centers.
`measure`	cluster validity measure to be considered when which.plot equals “partMeans”
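
The z-transformation mentioned for `normalized` can be illustrated in base R. This is a minimal sketch on a made-up matrix of cluster centers (the object `centers` and its values are hypothetical); it shows the transformation itself and is not taken from the plot method's internals.

```r
# Hypothetical matrix of cluster centers: rows are clusters, columns are variables
centers <- matrix(c(1, 2, 3,
                    10, 20, 60),
                  nrow = 3,
                  dimnames = list(paste0("cluster", 1:3), c("z1", "z2")))

# z-transformation per variable: subtract the column mean, divide by the column sd
z <- scale(centers, center = TRUE, scale = TRUE)

colMeans(z)      # each column now has mean 0 (up to rounding)
apply(z, 2, sd)  # and standard deviation 1
```

After this step, variables measured on very different scales become comparable in a single plot.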

Details

The compositional data set is either represented internally in orthonormal coordinates before a cluster algorithm is applied or, depending on the choice of parameters, the Aitchison distance is used.
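
The two routes are closely related: the Aitchison distance between two compositions equals the Euclidean distance between their orthonormal logratio coordinates. The base-R sketch below illustrates this on two made-up compositions. The helpers `clr`, `aitchison_dist` and `pivot_coords` are hypothetical names written for this illustration, not functions of the package, and the sign convention of the package's own coordinates may differ (which does not affect distances).

```r
# Centred logratio (clr) coefficients of one composition with strictly positive parts
clr <- function(x) log(x) - mean(log(x))

# Aitchison distance: Euclidean distance between the clr coefficients
aitchison_dist <- function(x, y) sqrt(sum((clr(x) - clr(y))^2))

# One choice of orthonormal (pivot) coordinates for a composition with D parts:
# z_i = sqrt((D - i) / (D - i + 1)) * log(x_i / geometric mean of x_(i+1), ..., x_D)
pivot_coords <- function(x) {
  D <- length(x)
  lx <- log(x)
  sapply(seq_len(D - 1), function(i) {
    sqrt((D - i) / (D - i + 1)) * (lx[i] - mean(lx[(i + 1):D]))
  })
}

a <- c(10, 30, 60)  # made-up compositions
b <- c(20, 20, 60)

d_aitchison <- aitchison_dist(a, b)
d_coords <- sqrt(sum((pivot_coords(a) - pivot_coords(b))^2))
all.equal(d_aitchison, d_coords)  # the two distances agree

# Rescaling a composition leaves the distance unchanged
d_rescaled <- aitchison_dist(a / sum(a), 5 * b)
all.equal(d_aitchison, d_rescaled)
```

Distance-based methods therefore see the same geometry on either route, as long as the coordinates are not rescaled afterwards.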

Value

An object of class clustCoDa containing all relevant information such as cluster centers, cluster memberships, and cluster statistics.

Author(s)

Matthias Templ (accessing the basic features of hclust, Mclust, kmeans, etc. that are all written by others)

References

M. Templ, P. Filzmoser, C. Reimann. Cluster analysis applied to regional geochemical data: Problems and possibilities. Applied Geochemistry, 23 (8), 2198–2213, 2008


Examples

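# Example data set shipped with the package
data(expenditures)
x <- expenditures
# Default method (Mclust) on robustly scaled pivot coordinates
rr <- clustCoDa(x, k = 6, scale = "robust", transformation = "pivotCoord")
# Aitchison distance instead of a coordinate transformation
rr2 <- clustCoDa(x, k = 6, distMethod = "Aitchison", scale = "none",
                 transformation = "identity")
# Single-linkage clustering based on the Aitchison distance
rr3 <- clustCoDa(x, k = 6, distMethod = "Aitchison", method = "single",
                 transformation = "identity", scale = "none")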
                 
## Not run: 
require(reshape2)
plot(rr)
plot(rr, normalized = TRUE)
plot(rr, normalized = TRUE, which.plot = "partMeans")

## End(Not run)

[Package robCompositions version 2.4.1 Index]