heatmap_words {deepMOU} | R Documentation |
Heatmap of word frequencies by cluster
Description
Displays a heatmap of the per-cluster frequency distributions of the most frequent terms, sorted by how informative they are.
Usage
heatmap_words(
x,
clusters,
n_words = 50,
legend_position = "bottom",
font_size = 12,
low_color = "grey92",
top_color = "red",
main = "Row frequencies of terms distribution",
xlabel = NULL,
ylabel = NULL,
legend_title = "Entropy"
)
Arguments
x |
Document-term matrix describing the frequency of terms that occur in a collection of documents. Rows correspond to documents in the collection and columns correspond to terms. |
clusters |
Integer vector with one entry per document (row of x), giving the cluster each document is assigned to. The clusters have to be numbered from 1 to the number of clusters. |
n_words |
Number of words to include in the heatmap (default is 50). |
legend_position |
Position of the legend (default is "bottom"). |
font_size |
Text size in pts (default is 12). |
low_color |
Base color for terms with no occurrence in a cluster (default is "grey92"). |
top_color |
Base color for terms concentrated in a single cluster (default is "red"). |
main |
A title for the plot. Default is "Row frequencies of terms distribution". |
xlabel |
A title for the x-axis. Default is NULL. |
ylabel |
A title for the y-axis. Default is NULL. |
legend_title |
A title for the legend. Default is "Entropy". |
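The clusters argument must use labels numbered from 1 to the number of clusters. As a minimal sketch in base R, labels coming from another source (character labels or non-consecutive integers) could be recoded as shown here. The vector raw_labels is hypothetical and only serves as an illustration.

```r
# Hypothetical labels from some external clustering or annotation
raw_labels <- c("sports", "politics", "sports", "tech", "politics")

# factor() sorts the distinct labels; as.integer() maps them to 1..K
clusters <- as.integer(factor(raw_labels))

# Check that the labels now cover exactly 1..K
n_clusters <- length(unique(clusters))
stopifnot(setequal(clusters, seq_len(n_clusters)))
```

The same recoding works for integer labels with gaps, for example c(2, 5, 5, 9). Before calling heatmap_words, it is also worth checking that length(clusters) equals nrow(x).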
Details
Takes as input the bag-of-words (document-term) matrix and a clustering, and returns a heatmap displaying, for each term, the row frequency distribution of its occurrences across the clusters. Words are sorted by the entropy of this distribution.
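The quantities described above can be illustrated with a small base R sketch. This is not the package's internal code: it assumes that the row frequency of a term is the share of its total occurrences falling in each cluster, that entropy is the Shannon entropy of that share, and that lower entropy means a more informative term. The toy matrix and its term names are made up for illustration.

```r
# Toy document-term matrix: 6 documents (rows) x 4 terms (columns)
x <- matrix(
  c(3, 0, 1, 1,
    4, 0, 1, 1,
    2, 0, 1, 1,
    0, 5, 1, 3,
    0, 3, 1, 3,
    0, 4, 1, 3),
  nrow = 6, byrow = TRUE,
  dimnames = list(NULL, c("alpha", "beta", "gamma", "delta"))
)
clusters <- c(1L, 1L, 1L, 2L, 2L, 2L)

# Total count of each term within each cluster (clusters x terms)
counts <- rowsum(x, group = clusters)

# Row frequencies (terms x clusters): each row sums to 1
row_freq <- t(counts) / colSums(counts)

# Shannon entropy of each term's distribution over clusters (0 * log(0) := 0)
term_entropy <- apply(row_freq, 1, function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
})

# Terms ordered from most concentrated (low entropy) to most spread out
sorted_terms <- names(term_entropy)[order(term_entropy)]
print(round(row_freq, 2))
print(sorted_terms)
```

In this toy case, alpha and beta each occur in a single cluster and have entropy 0, while gamma is split evenly between the two clusters and has the maximum entropy log(2).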
Value
A graphical aid to describe the clusters according to the most informative words.
Examples
# Load the package and the CNAE2 dataset
library(deepMOU)
data("CNAE2")
# Get document labels by clustering using mou_EM
mou_CNAE2 <- mou_EM(x = CNAE2, k = 2)
# Draw the heatmap from the fitted model's data and cluster labels
heatmap_words(x = mou_CNAE2$x, clusters = mou_CNAE2$clusters)