| FitLdaModel {textmineR} | R Documentation | 
Fit a Latent Dirichlet Allocation topic model
Description
Fit a Latent Dirichlet Allocation topic model using collapsed Gibbs sampling.
Usage
FitLdaModel(
  dtm,
  k,
  iterations = NULL,
  burnin = -1,
  alpha = 0.1,
  beta = 0.05,
  optimize_alpha = FALSE,
  calc_likelihood = FALSE,
  calc_coherence = TRUE,
  calc_r2 = FALSE,
  ...
)
Arguments
| dtm | A document term matrix or term co-occurrence matrix of class dgCMatrix | 
| k | Integer number of topics | 
| iterations | Integer number of iterations for the Gibbs sampler to run. A future version may include automatic stopping criteria. | 
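| burnin | Integer number of burnin iterations. If burnin is greater than -1, the resulting "phi" and "theta" matrices are an average over all iterations greater than burnin. | 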
| alpha | Vector of length k for asymmetric or a number for symmetric. This is the prior for topics over documents. | 
| beta | Vector of length ncol(dtm) for asymmetric or a number for symmetric. This is the prior for words over topics. | 
| optimize_alpha | Logical. Do you want to optimize alpha every 10 Gibbs iterations?
Defaults to FALSE. | 
| calc_likelihood | Logical. Do you want to calculate the likelihood every 10 Gibbs iterations?
Useful for assessing convergence. Defaults to FALSE. | 
| calc_coherence | Logical. Do you want to calculate probabilistic coherence of topics
after the model is trained? Defaults to TRUE. | 
| calc_r2 | Logical. Do you want to calculate R-squared after the model is trained?
Defaults to FALSE. | 
| ... | Other arguments to be passed to TmParallelApply | 
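To illustrate how the logical arguments above combine, the following sketch fits a small model with alpha optimization, likelihood tracking, coherence, and R-squared all switched on. The element names log_likelihood, r2, and coherence on the fitted object are assumptions taken from the package's topic modeling vignette, not from this page; check names(m) if your installed version differs.

```r
library(textmineR)

# sample document term matrix shipped with textmineR
data(nih_sample_dtm)

set.seed(12345)

# symmetric priors given as single numbers; every optional
# calculation is enabled so its output can be inspected afterwards
m <- FitLdaModel(dtm = nih_sample_dtm[1:20, ], k = 5,
                 iterations = 200, burnin = 175,
                 alpha = 0.1, beta = 0.05,
                 optimize_alpha = TRUE,
                 calc_likelihood = TRUE,
                 calc_coherence = TRUE,
                 calc_r2 = TRUE)

# assumed element names (see the note above)
plot(m$log_likelihood, type = "l")  # likelihood trace for judging convergence
m$r2                                # R-squared of the fitted model
summary(m$coherence)                # one coherence value per topic
```

A likelihood trace that is still climbing at the last iteration would suggest increasing iterations before relying on the result.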
Details
EXPLAIN IMPLEMENTATION DETAILS
Value
Returns an S3 object of class c("LDA", "TopicModel"). DESCRIBE MORE
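Since the description of the returned object is incomplete here, a quick way to see what it holds is to inspect it directly. The sketch below assumes the object carries a theta matrix (documents by topics) and a phi matrix (topics by tokens), which are the matrix names referred to in the burnin argument; the call to GetTopTerms uses that textmineR helper to list the highest-probability terms per topic.

```r
library(textmineR)

data(nih_sample_dtm)

set.seed(12345)
m <- FitLdaModel(dtm = nih_sample_dtm[1:20, ], k = 5,
                 iterations = 200, burnin = 175)

class(m)      # class of the fitted object
names(m)      # elements stored on the fitted object

# assumed elements: topic distributions per document and
# token distributions per topic
dim(m$theta)  # documents by topics
dim(m$phi)    # topics by tokens

# five highest-probability terms for each topic
GetTopTerms(phi = m$phi, M = 5)
```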
Examples
# load some data
data(nih_sample_dtm)
# fit a model 
set.seed(12345)
m <- FitLdaModel(dtm = nih_sample_dtm[1:20,], k = 5,
                 iterations = 200, burnin = 175)
str(m)
# predict on held-out documents using gibbs sampling "fold in"
p1 <- predict(m, nih_sample_dtm[21:100,], method = "gibbs",
              iterations = 200, burnin = 175)
# predict on held-out documents using the dot product method
p2 <- predict(m, nih_sample_dtm[21:100,], method = "dot")
# compare the methods
barplot(rbind(p1[1,],p2[1,]), beside = TRUE, col = c("red", "blue"))
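The bar plot compares the two prediction methods for a single document. As a rough complement, the sketch below summarizes agreement across all held-out documents by checking how often both methods assign the same highest-probability topic. It assumes predict returns a numeric matrix with one row per document and one column per topic; the names top1, top2, and agreement are introduced here for illustration only.

```r
library(textmineR)

data(nih_sample_dtm)

set.seed(12345)
m <- FitLdaModel(dtm = nih_sample_dtm[1:20, ], k = 5,
                 iterations = 200, burnin = 175)

# predictions for the held-out documents with both methods
p1 <- predict(m, nih_sample_dtm[21:100, ], method = "gibbs",
              iterations = 200, burnin = 175)
p2 <- predict(m, nih_sample_dtm[21:100, ], method = "dot")

# column index of the highest-probability topic in each row
top1 <- max.col(p1, ties.method = "first")
top2 <- max.col(p2, ties.method = "first")

# share of held-out documents where both methods pick the same topic;
# na.rm guards against missing values in either result
agreement <- mean(top1 == top2, na.rm = TRUE)
agreement
```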