table.collapse {CAinterprTools}                R Documentation
This function collapses the rows and columns of the input contingency
table on the basis of the results of a hierarchical clustering.
table.collapse(data, graph = FALSE)
data     Name of the dataset (must be in dataframe format).
graph    Logical (TRUE/FALSE); set to TRUE to produce the row and column profile dendrograms.
The function returns a list containing the input table, the rows-collapsed
table, the columns-collapsed table, and a table with both rows and columns
collapsed. It optionally returns two dendrograms (one for the row profiles,
one for the column profiles) representing the clusters.
The hierarchical clustering is obtained using the 'HCPC()' function from the 'FactoMineR' package.
Rationale: clustering the rows and/or columns of a table may interest users
who want to know where a "significant association is concentrated" by
"collecting together similar rows (or columns) in discrete groups" (Greenacre
M, Correspondence Analysis in Practice, Boca Raton-London-New York,
Chapman&Hall/CRC 2007, pp. 116, 120). Rows and/or columns are progressively
aggregated in such a way that each successive merge produces the smallest
change in the table's inertia. The underlying logic is that rows (or columns)
whose merging produces only a small change in the table's inertia have
similar profiles. This procedure can be thought of as maximizing the
between-group inertia and minimizing the within-group inertia.
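The merging logic above can be illustrated with a minimal base-R sketch. It is not the code used by 'table.collapse()' or 'HCPC()'; the helper names 'table_inertia' and 'merge_loss' and the toy table are hypothetical, introduced only for illustration. The sketch computes the total inertia of a table (the chi-square statistic divided by the grand total) and the inertia lost when two rows are summed into one.

```r
# Total inertia of a contingency table: Pearson chi-square statistic
# divided by the grand total n.
table_inertia <- function(tab) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  expected <- outer(rowSums(tab), colSums(tab)) / n
  sum((tab - expected)^2 / expected) / n
}

# Inertia lost when rows i and j are merged (summed) into a single row.
merge_loss <- function(tab, i, j) {
  tab <- as.matrix(tab)
  merged <- rbind(tab[-c(i, j), , drop = FALSE], tab[i, ] + tab[j, ])
  table_inertia(tab) - table_inertia(merged)
}

# Hypothetical toy table: rows A and B have proportional counts,
# hence identical profiles; row C has a clearly different profile.
toy <- matrix(c(10, 20, 30,
                20, 40, 60,
                30, 10,  5),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("x", "y", "z")))

loss_AB <- merge_loss(toy, 1, 2)  # rows with identical profiles
loss_AC <- merge_loss(toy, 1, 3)  # rows with different profiles
```

In this toy case, merging A and B leaves the inertia unchanged up to floating-point error, while merging A and C loses a positive amount of inertia, so A and B would be aggregated first.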
An essentially similar method is provided by the 'FactoMineR' package (Husson F,
Le S, Pages J, Exploratory Multivariate Analysis by Example Using R, Boca
Raton-London-New York, CRC Press, pp. 177-185). The cluster solution is based
on the following rationale: a division into Q (i.e., a given number of)
clusters is suggested when the increase in between-group inertia attained
when passing from a Q-1 to a Q partition is greater than that attained when
passing from a Q to a Q+1 partition. In other words, if a further merge of
rows (or columns) would sharply raise the within-group inertia, very
different profiles would be aggregated at that step, so the merging should
stop before it.
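One way to read this rule is sketched below in base R. The between-group inertia values are invented for illustration, and the ratio-based selection is an assumption about how the comparison could be made operational, not a transcription of the 'HCPC()' internals.

```r
# Hypothetical between-group inertia for partitions into 1, 2, ..., 6 clusters.
between <- c(0, 0.30, 0.52, 0.56, 0.59, 0.61)

# gain[k] is the increase in between-group inertia when passing
# from k to k + 1 clusters.
gain <- diff(between)

# Candidate numbers of clusters for which both gains are available.
Q <- 2:(length(between) - 1)

# Gain obtained when leaving Q (Q -> Q+1) relative to the gain obtained
# when reaching Q (Q-1 -> Q); a small ratio means that adding one more
# cluster beyond Q brings comparatively little.
ratio <- gain[Q] / gain[Q - 1]

suggested_Q <- Q[which.min(ratio)]
```

With these invented values the gain drops from 0.22 (2 to 3 clusters) to 0.04 (3 to 4 clusters), so the sketch suggests a three-cluster partition.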
data(greenacre_data)

# collapse the table, store the results into an object called 'res',
# and return 2 dendrograms
res <- table.collapse(greenacre_data, graph=TRUE)