top.topic.words {lda}    R Documentation
Get the Top Words and Documents in Each Topic
Description
These functions take a model fitted using lda.collapsed.gibbs.sampler and return a matrix of the top words in each topic (top.topic.words) or of the top documents in each topic (top.topic.documents).
Usage
top.topic.words(topics, num.words = 20, by.score = FALSE)
top.topic.documents(document_sums, num.documents = 20, alpha = 0.1)
Arguments
topics
For top.topic.words, a K x V matrix of word counts (or weights) with one row per topic and one column per vocabulary word; the column names must be the words of the vocabulary. The topics field returned by lda.collapsed.gibbs.sampler can be used directly.
num.words
For top.topic.words, the number of top words to return for each topic.
document_sums
For top.topic.documents, a K x D matrix of per-document topic counts with one row per topic and one column per document. The document_sums field returned by lda.collapsed.gibbs.sampler can be used directly.
num.documents
For top.topic.documents, the number of top documents to return for each topic.
by.score
If by.score is set to TRUE, words in each topic are ranked by a score that downweights words which are common across all topics, rather than by their raw weight within the topic (see the sketch after this table).
alpha
The scalar value of the Dirichlet hyperparameter for topic proportions.
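The by.score ranking can be pictured with a short sketch. The code below is illustrative only, not the package's exact implementation: it assumes topics is a K x V count matrix with vocabulary words as column names, and scores each word by its within-topic probability times how far its log-probability sits above the average log-probability of that word across topics, which pushes down words that are frequent in every topic.
## Illustrative only: rank words in each topic by a score that penalizes
## words which are common to all topics (names and constants are assumptions).
rank.words.by.score <- function(topics, num.words = 20) {
  probs <- topics / (rowSums(topics) + 1e-5)   ## p(word | topic), row-normalized
  log.probs <- log(probs + 1e-5)               ## guard against log(0)
  ## score[k, w] = p(w | k) * (log p(w | k) - mean over topics of log p(w | .))
  scores <- probs * sweep(log.probs, 2, colMeans(log.probs), "-")
  ## For each topic (row), return the num.words highest-scoring words.
  apply(scores, 1, function(x) colnames(topics)[order(x, decreasing = TRUE)[1:num.words]])
}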
Value
For top.topic.words, a num.words x K character matrix where each column contains the top words for that topic.
For top.topic.documents, a num.documents x K integer matrix where each column contains the top documents for that topic. The entries in the matrix are column-indexed references into document_sums.
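To make the shapes concrete, here is a minimal sketch; fit stands for any list returned by lda.collapsed.gibbs.sampler, and the names fit and my.documents are illustrative placeholders only.
## Illustrative only: the shapes of the two return values.
words <- top.topic.words(fit$topics, num.words = 10)   ## 10 x K character matrix
docs  <- top.topic.documents(fit$document_sums, 5)     ## 5 x K integer matrix
## docs[, k] holds column positions in fit$document_sums, i.e. positions of
## documents in the corpus the model was fitted on; for example:
top.docs.topic.2 <- my.documents[docs[, 2]]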
Author(s)
Jonathan Chang (slycoder@gmail.com)
References
Blei, David M., Ng, Andrew Y., and Jordan, Michael I. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993-1022, 2003.
See Also
lda.collapsed.gibbs.sampler for the format of topics and document_sums.
predictive.distribution demonstrates another use for a fitted topic matrix.
Examples
## From demo(lda).
data(cora.documents)
data(cora.vocab)
K <- 10 ## Num clusters
result <- lda.collapsed.gibbs.sampler(cora.documents,
                                      K,    ## Num clusters
                                      cora.vocab,
                                      25,   ## Num iterations
                                      0.1,  ## alpha: Dirichlet prior on topic proportions
                                      0.1)  ## eta: Dirichlet prior on topic-word distributions
## Get the top words in each cluster
top.words <- top.topic.words(result$topics, 5, by.score=TRUE)
## top.words:
## [,1] [,2] [,3] [,4] [,5]
## [1,] "decision" "network" "planning" "learning" "design"
## [2,] "learning" "time" "visual" "networks" "logic"
## [3,] "tree" "networks" "model" "neural" "search"
## [4,] "trees" "algorithm" "memory" "system" "learning"
## [5,] "classification" "data" "system" "reinforcement" "systems"
## [,6] [,7] [,8] [,9] [,10]
## [1,] "learning" "models" "belief" "genetic" "research"
## [2,] "search" "networks" "model" "search" "reasoning"
## [3,] "crossover" "bayesian" "theory" "optimization" "grant"
## [4,] "algorithm" "data" "distribution" "evolutionary" "science"
## [5,] "complexity" "hidden" "markov" "function" "supported"