as.LDA {sentopics}    R Documentation

Conversions from other packages to LDA

Description

These functions convert estimated models from other topic modeling packages to the format used by sentopics.

Usage

as.LDA(x, ...)

## S3 method for class 'STM'
as.LDA(x, docs, ...)

## S3 method for class 'LDA_Gibbs'
as.LDA(x, docs, ...)

## S3 method for class 'LDA_VEM'
as.LDA(x, docs, ...)

## S3 method for class 'textmodel_lda'
as.LDA(x, ...)

as.LDA_lda(list, docs, alpha, eta)

## S3 method for class 'keyATM_output'
as.LDA(x, docs, ...)

Arguments

x

an estimated topic model from stm, topicmodels, seededlda or keyATM.

...

arguments passed to other methods.

docs

for some objects, the documents used to initialize the model.

list

the list containing an estimated model from lda.

alpha

for lda models, the document-topic mixture hyperparameter. If missing, the hyperparameter will be set to 50/K.

eta

for lda models, the topic-word mixture hyperparameter. Other packages refer to this hyperparameter as beta. If missing, the hyperparameter will be set to 0.01.
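The fallback values described for alpha and eta can be illustrated with a minimal sketch. The helper default_lda_priors() below is hypothetical and not part of sentopics. It only mirrors the documented defaults and assumes that K denotes the number of topics.

```r
# Hypothetical helper, not a sentopics function: reproduces the documented
# fallbacks (alpha = 50/K, eta = 0.01) when the arguments are missing.
# Assumption: K is the number of topics of the lda model.
default_lda_priors <- function(K, alpha, eta) {
  if (missing(alpha)) alpha <- 50 / K
  if (missing(eta)) eta <- 0.01
  list(alpha = alpha, eta = eta)
}

# Both hyperparameters missing: documented defaults are used.
priors_default <- default_lda_priors(K = 5)

# Explicit values are kept unchanged.
priors_custom <- default_lda_priors(K = 5, alpha = 0.1, eta = 0.1)
```

Under this sketch, a model with 5 topics would receive alpha = 10 unless a value is supplied, so passing the values used during estimation is usually the safer choice.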

Details

Some models do not store the topic assignment of each word (for example, models estimated through variational inference). For these, the conversion is limited and some sentopics functionality is disabled. The list of affected functions is subject to change and currently includes fit(), mergeTopics() and rJST.LDA().

Since models from the lda package are simply lists of outputs, the function as.LDA_lda() is separate from the other methods and should be applied directly to a list containing a model.
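The reason a plain list needs its own function can be sketched in base R. The generic to_lda(), its method and to_lda_from_list() below are hypothetical toy names, not sentopics functions. They only illustrate that S3 dispatch relies on a class attribute, which a bare list does not carry.

```r
# Hypothetical toy generic, not part of sentopics: dispatches on class(x).
to_lda <- function(x, ...) UseMethod("to_lda")

# Method for objects that carry a dedicated class attribute.
to_lda.toy_model <- function(x, ...) "converted via S3 method"

# Stand-alone converter for bare lists, analogous in spirit to as.LDA_lda().
to_lda_from_list <- function(list, ...) "converted via stand-alone function"

classed <- structure(list(topics = matrix(0, 2, 3)), class = "toy_model")
bare    <- list(topics = matrix(0, 2, 3))

# The classed object is dispatched to to_lda.toy_model().
res_classed <- to_lda(classed)

# The bare list has no identifying class, so no method applies.
res_bare <- tryCatch(to_lda(bare), error = function(e) "no applicable method")

# The stand-alone function is called on the list directly.
res_direct <- to_lda_from_list(bare)
```

In this toy setting the bare list can only be converted through the function that is called by name, which is the same usage pattern as as.LDA_lda().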

Value

An S3 list of class LDA, as if it had been created and estimated using LDA() and fit().

Examples


## stm
library("stm")
stm <- stm(poliblog5k.docs, poliblog5k.voc, K=25,
           prevalence=~rating, data=poliblog5k.meta,
           max.em.its=2, init.type="Spectral")
as.LDA(stm, docs = poliblog5k.docs)

## lda
library("lda")
data("cora.documents")
data("cora.vocab")
lda <- lda.collapsed.gibbs.sampler(cora.documents,
                                   5, ## Num clusters
                                   cora.vocab,
                                   100, ## Num iterations
                                   0.1, ## alpha
                                   0.1) ## eta
LDA <- as.LDA_lda(lda, docs = cora.documents, alpha = .1, eta = .1)

## topicmodels
data("AssociatedPress", package = "topicmodels")
lda <- topicmodels::LDA(AssociatedPress[1:20,],
                        control = list(alpha = 0.1), k = 2)
LDA <- as.LDA(lda, docs = AssociatedPress[1:20,])

## seededlda
library("seededlda")
lda <- textmodel_lda(dfm(ECB_press_conferences_tokens),
                     k = 6, max_iter = 100)
LDA <- as.LDA(lda)

## keyATM
library("keyATM")
data(keyATM_data_bills, package = "keyATM")
keyATM_docs <- keyATM_read(keyATM_data_bills$doc_dfm)
out <- keyATM(docs = keyATM_docs, model = "base",
              no_keyword_topics = 5,
              keywords = keyATM_data_bills$keywords)
LDA <- as.LDA(out, docs = keyATM_docs)


[Package sentopics version 0.7.3 Index]