greenclust {greenclust}    R Documentation

Row Clustering Using Greenacre's Method


Description

Iteratively collapses the rows of a table (typically a contingency table), at each step merging the pair of rows whose combination produces the smallest loss of chi-squared.
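As a rough illustration of a single merge step, the base-R sketch below scores every pair of rows by the chi-squared lost when the two rows are combined, then returns the pair with the smallest loss. The helpers chisq_stat() and best_merge() and the toy table are hypothetical names invented for this example; this is not the package's source code.

```r
# Illustrative sketch only; not the greenclust implementation.

# Pearson chi-squared statistic for a table of counts (no continuity correction).
chisq_stat <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# Return the indices of the two rows whose merger loses the least chi-squared.
best_merge <- function(tab) {
  pairs <- combn(nrow(tab), 2)
  loss <- apply(pairs, 2, function(p) {
    merged <- rbind(tab[-p, , drop = FALSE],
                    colSums(tab[p, , drop = FALSE]))
    chisq_stat(tab) - chisq_stat(merged)
  })
  pairs[, which.min(loss)]
}

# Hypothetical 4 x 2 table: rows "a" and "b" have identical profiles (1/3, 2/3),
# so merging them should cost essentially nothing.
toy <- matrix(c(10, 20,
                20, 40,
                30, 10,
                 5, 25),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("a", "b", "c", "d"), c("no", "yes")))

rownames(toy)[best_merge(toy)]
```

Repeating this step on the merged table until one row remains gives the full sequence of merges that a dendrogram would display.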


Usage

greenclust(x, correct = FALSE, verbose = FALSE)

Arguments

x
a numeric matrix or data frame.

correct
a logical indicating whether to apply a continuity correction if and when the clustered table reaches a 2x2 dimension.

verbose
a logical; if TRUE, prints the clustered table along with r-squared and p-value at each step.
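To show what a continuity correction does to a 2x2 table, the sketch below uses base R's chisq.test(), which applies Yates' correction only to 2x2 tables when correct = TRUE. The counts are hypothetical, and whether greenclust() makes the same call internally is an assumption, not something stated here.

```r
# Hypothetical 2 x 2 table of counts.
m <- matrix(c(12, 5, 7, 9), nrow = 2)

# Same test with and without Yates' continuity correction.
uncorrected <- chisq.test(m, correct = FALSE)
corrected   <- chisq.test(m, correct = TRUE)

# The corrected statistic is smaller, so its p-value is larger.
c(uncorrected = unname(uncorrected$statistic),
  corrected   = unname(corrected$statistic))
c(uncorrected = uncorrected$p.value,
  corrected   = corrected$p.value)
```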


Value

An object of class greenclust that is compatible with most functions for hclust objects, such as plot() and rect.hclust(). The height vector represents the proportion of chi-squared, relative to the original table, seen at each clustering step. The greenclust object also includes a vector holding the chi-squared test p-value at each step and a logical vector indicating whether that step had a tie for "winner", that is, more than one pair of rows tied for the smallest loss.
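A proportion of chi-squared relative to the original table can be computed by hand, as in the sketch below, which divides the statistic of a collapsed table by that of the original. The toy counts and the variable names are hypothetical, and whether height stores the proportion retained or the proportion lost is left open here.

```r
# Hypothetical 4 x 2 table of counts.
toy <- matrix(c(10, 20,
                20, 40,
                30, 10,
                 5, 25),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("a", "b", "c", "d"), c("no", "yes")))

# Collapse rows "c" and "d" into a single row "cd".
collapsed <- rbind(toy[c("a", "b"), ],
                   cd = colSums(toy[c("c", "d"), ]))

# Chi-squared statistics before and after collapsing.
chisq_full      <- unname(chisq.test(toy, correct = FALSE)$statistic)
chisq_collapsed <- unname(chisq.test(collapsed, correct = FALSE)$statistic)

# Proportion of the original chi-squared retained, and the proportion lost.
retained <- chisq_collapsed / chisq_full
c(retained = retained, lost = 1 - retained)
```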


References

Greenacre, M.J. (1988) "Clustering the Rows and Columns of a Contingency Table," Journal of Classification 5, 39-51. doi:10.1007/BF01901670

See Also

greencut, greenplot, assign.cluster


Examples

# Combine Titanic passenger attributes into a single category
tab <- t(as.data.frame(apply(Titanic, 4:1, FUN=sum)))
# Remove rows with all zeros
tab <- tab[apply(tab, 1, sum) > 0, ]

# Perform clustering on contingency table
grc <- greenclust(tab)

# Plot r-squared and p-values for each potential cut point
greenplot(grc)

# Get clusters at suggested cut point
clusters <- greencut(grc)

# Plot dendrogram with clusters marked
plot(grc)
rect.hclust(grc, max(clusters))

[Package greenclust version 1.1.1 Index]