tCorpus$lda_fit {corpustools}    R Documentation

Estimate an LDA topic model

Description

Estimate an LDA topic model using the LDA function from the topicmodels package. The modelling parameters are passed on to that function and have workable defaults. See the documentation of that function for more information.

Usage

## R6 method for class tCorpus. Use as tc$method (where tc is a tCorpus object).

lda_fit(feature, create_feature=NULL, K=50, num.iterations=500, alpha=50/K,
     eta=.01, burnin=250, context_level=c('document','sentence'), ...)

Arguments

feature

the name of the feature column

create_feature

optionally, the name of a new feature column that stores the topic to which each feature was assigned (in the last iteration). Must be a character string.

K

the number of topics (clusters)

num.iterations

the number of iterations

method

the estimation method. See the documentation of the LDA function in the topicmodels package.

alpha

the alpha parameter

eta

the eta parameter

burnin

the number of burn-in iterations
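
The default alpha = 50/K shown in the usage line depends on K, so changing K also changes alpha unless alpha is set explicitly. A minimal sketch of how R resolves such a default, assuming lda_fit relies on R's ordinary lazy evaluation of default arguments; lda_defaults is a hypothetical stand-in that copies only these two defaults and is not part of corpustools:

```r
# Hypothetical stand-in: mirrors only the K and alpha defaults from the usage line.
lda_defaults <- function(K = 50, alpha = 50 / K) {
  # alpha's default is evaluated here, after K is known
  c(K = K, alpha = alpha)
}

lda_defaults()                      # K = 50, alpha = 1
lda_defaults(K = 5)                 # K = 5,  alpha = 10
lda_defaults(K = 5, alpha = 0.1)    # an explicit alpha overrides the default
```

With a small K the implied alpha becomes large, which may be why one would pass alpha explicitly when fitting only a few topics.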

Value

A fitted LDA model and, optionally, a new column in the tCorpus (added by reference).
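
The two results can be inspected separately: the returned model with functions from topicmodels, and the new column on the tCorpus itself. A hedged sketch, assuming both packages are installed and that the returned object is a regular topicmodels LDA model, so that topicmodels::posterior applies to it; fitting may take a while:

```r
if (requireNamespace("corpustools", quietly = TRUE) &&
    requireNamespace("topicmodels", quietly = TRUE)) {
  library(corpustools)

  tc <- create_tcorpus(sotu_texts, doc_column = "id")
  tc$preprocess("token", "feature", remove_stopwords = TRUE,
                use_stemming = TRUE, min_freq = 10)
  set.seed(1)
  m <- tc$lda_fit("feature", create_feature = "lda", K = 5, alpha = 0.1)

  # Return value: the fitted model, here summarised by its posterior
  post <- topicmodels::posterior(m)
  print(dim(post$topics))  # documents x topics
  print(dim(post$terms))   # topics x terms

  # Side effect: tc was modified in place, so no reassignment was needed
  print("lda" %in% colnames(tc$tokens))
}
```

Because the column is added by reference, calling tc$lda_fit(...) without assigning the result still changes tc.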

Examples


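if (interactive()) {
  # Build a tCorpus from the sotu_texts demo data
  tc = create_tcorpus(sotu_texts, doc_column = 'id')
  # Stem tokens, drop stopwords and infrequent terms; store the result in 'feature'
  tc$preprocess('token', 'feature', remove_stopwords = TRUE, use_stemming = TRUE, min_freq=10)
  set.seed(1)  # make the sampler reproducible
  # Fit 5 topics and store each token's topic assignment in the new 'lda' column
  m = tc$lda_fit('feature', create_feature = 'lda', K = 5, alpha = 0.1)
  m
  topicmodels::terms(m, 10)  # top 10 terms per topic
  tc$tokens                  # token data, now including the 'lda' column
}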


[Package corpustools version 0.5.1 Index]