| LDA_set {LDATS} | R Documentation |
Run a set of Latent Dirichlet Allocation models
Description
For a given dataset consisting of counts of words across
multiple documents in a corpus, fit multiple Latent Dirichlet
Allocation (LDA) models (using the Variational Expectation
Maximization (VEM) algorithm; Blei et al. 2003) to account for [1]
uncertainty in the number of latent topics and [2] the impact of initial
values in the estimation procedure.
LDA_set is a list wrapper of LDA
in the topicmodels package (Grun and Hornik 2011).
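As a rough illustration of what the wrapper iterates over, the sketch below enumerates the combinations of topic number and replicate start using base R only. The seed values (1 through nseeds) are placeholders chosen for this example and are not necessarily the seeds LDA_set itself uses; model_grid and n_models are hypothetical names.

```r
# Hypothetical inputs mirroring the LDA_set arguments
topics <- c(2, 3, 4)
nseeds <- 2

# One row per model: every topic count paired with every replicate start.
# The seed values 1..nseeds are placeholders for illustration only.
model_grid <- expand.grid(seed = seq_len(nseeds), k = topics)

# Total number of models that would be fit
n_models <- nrow(model_grid)

print(model_grid)
print(n_models)
```

With a vector of topics and nseeds replicate starts per value, the number of models is length(topics) * nseeds, so run time grows with both arguments.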
check_LDA_set_inputs checks that all of the inputs
are proper for LDA_set (that the table of observations is
conformable to a matrix of integers, the number of topics is an integer,
the number of seeds is an integer and the controls list is proper).
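The checks described above might look something like the following base R sketch. This is not the LDATS implementation: is_integer_conformable and check_inputs_sketch are hypothetical helpers written for this example, and the real function may apply further checks and use different messages.

```r
# Hypothetical helper (not part of LDATS): TRUE if x can be coerced to a
# numeric matrix whose entries are all whole numbers.
is_integer_conformable <- function(x) {
  x <- tryCatch(as.matrix(x), error = function(e) NULL)
  !is.null(x) && is.numeric(x) && !anyNA(x) && all(x %% 1 == 0)
}

# Hypothetical stand-in for the kind of validation described above:
# stops with an error on the first improper input, otherwise returns NULL.
check_inputs_sketch <- function(document_term_table, topics, nseeds, control) {
  if (!is_integer_conformable(document_term_table)) {
    stop("document_term_table is not conformable to a matrix of integers")
  }
  if (!is_integer_conformable(topics)) {
    stop("topics must be integer-valued")
  }
  if (length(nseeds) != 1 || !is_integer_conformable(nseeds)) {
    stop("nseeds must be a single integer value")
  }
  if (!is.list(control)) {
    stop("control must be a list")
  }
  invisible(NULL)
}

# Passes silently: integer counts, integer topics, one seed, list control
check_inputs_sketch(matrix(1:6, nrow = 2), topics = 2:3, nseeds = 1,
                    control = list())
```

Failing fast on improper inputs avoids starting a potentially long set of model fits that would error part-way through.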
Usage
LDA_set(document_term_table, topics = 2, nseeds = 1, control = list())
check_LDA_set_inputs(document_term_table, topics, nseeds, control)
Arguments
document_term_table |
Table of observation count data (rows:
documents, columns: terms). Must be conformable to a matrix of integers. |
topics |
Vector of the number of topics to evaluate for each model.
Must be conformable to integer values. |
nseeds |
Number of seeds (replicate starts) to use for each
value of topics. |
control |
A list of control options (default: an empty list). |
Value
LDA_set: list (class: LDA_set) of LDA models
(class: LDA_VEM).
check_LDA_set_inputs: throws an error if any input is
improper; otherwise returns NULL.
References
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Grun, B., and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.
Examples
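library(LDATS)
# Load the rodents example data and extract the document-term table
data(rodents)
lda_data <- rodents$document_term_table
# Fit 2-topic LDAs from 2 replicate starts (2 models in total)
r_LDA <- LDA_set(lda_data, topics = 2, nseeds = 2)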