ldaformat2dtm {topicmodels} | R Documentation |
Transform data from and for use with the lda package
Description
ldaformat2dtm() transforms data in the format used by package lda into a document-term matrix, the data format used to fit topic models with package topicmodels.
dtm2ldaformat() transforms data in the form of a document-term matrix into the LDA format used by package lda.
Usage
ldaformat2dtm(documents, vocab, omit_empty = TRUE)
dtm2ldaformat(x, omit_empty = TRUE)
Arguments
documents |
A |
vocab |
A |
x |
An object of class |
omit_empty |
A logical indicating if empty documents should be removed when converting the objects. By default empty documents are removed. |
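The effect of omit_empty can be sketched on hand-built toy data. The sketch assumes the convention of package lda that each document is an integer matrix with two rows (0-based vocabulary indices in the first row, counts in the second) and that an empty document is a matrix with two rows and zero columns. The objects vocab and docs are made up for illustration and are not part of either package.

```r
library("topicmodels")

## Hypothetical toy corpus (assumption: lda-style 2-row integer matrices)
vocab <- c("topic", "model", "corpus")
docs <- list(
  d1 = rbind(c(0L, 1L), c(2L, 1L)),             # "topic" x 2, "model" x 1
  d2 = matrix(integer(0), nrow = 2, ncol = 0),  # empty document
  d3 = rbind(c(1L, 2L), c(1L, 3L))              # "model" x 1, "corpus" x 3
)

## Default: documents without any terms are dropped
dtm_drop <- ldaformat2dtm(docs, vocab)

## Keep a row for every document, including the empty one
dtm_keep <- ldaformat2dtm(docs, vocab, omit_empty = FALSE)

nrow(dtm_drop)
nrow(dtm_keep)
```

Keeping empty documents can be useful when row positions must stay aligned with external metadata, but an all-zero row may need to be removed again before a model is fitted.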
Value
An object of class "DocumentTermMatrix" is returned by ldaformat2dtm() and a list with components "documents" and "vocab" by dtm2ldaformat().
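A minimal sketch of inspecting both return values, using a made-up two-document corpus rather than real data. It assumes the lda-style layout of one integer matrix per document, with 0-based vocabulary indices in the first row and counts in the second. The names vocab, docs and back are illustrative only.

```r
library("topicmodels")

## Hypothetical toy corpus (assumption: lda-style 2-row integer matrices)
vocab <- c("topic", "model", "corpus")
docs <- list(
  d1 = rbind(c(0L, 1L), c(2L, 1L)),  # "topic" x 2, "model" x 1
  d2 = rbind(c(1L, 2L), c(1L, 3L))   # "model" x 1, "corpus" x 3
)

## lda format -> document-term matrix
dtm <- ldaformat2dtm(docs, vocab)
class(dtm)      # class vector includes "DocumentTermMatrix"
dim(dtm)        # documents x terms
as.matrix(dtm)  # dense view, one row per document

## document-term matrix -> lda format
back <- dtm2ldaformat(dtm)
names(back)     # components of the returned list
str(back$documents[[1]])
```

The dense view from as.matrix() is only practical for small corpora; for anything larger it is usually better to work with the sparse object directly.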
Author(s)
Bettina Gruen
Examples
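if (require("lda")) {
  ## Load the Cora data shipped with package lda
  data("cora.documents", package = "lda")
  data("cora.vocab", package = "lda")
  ## Convert to a document-term matrix and back again
  dtm <- ldaformat2dtm(cora.documents, cora.vocab)
  cora <- dtm2ldaformat(dtm)
  ## Compare the round trip with the original data
  all.equal(cora, list(documents = cora.documents,
                       vocab = cora.vocab))
}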