slda.predict {lda}    R Documentation

Predict the response variable of documents using an sLDA model.


These functions take a fitted sLDA model and predict the value of the response variable (or document-topic sums) for each given document.


slda.predict(documents, topics, model, alpha, eta,
             num.iterations = 100, average.iterations = 50, trace = 0L)

slda.predict.docsums(documents, topics, alpha, eta,
                     num.iterations = 100, average.iterations = 50, trace = 0L)



documents: A list of document matrices comprising a corpus, in the format described in lda.collapsed.gibbs.sampler.


topics: A $K \times V$ matrix in which each entry is an integer giving the number of times the word (column) has been allocated to the topic (row). A normalised version of this matrix is sometimes denoted $\beta_{w,k}$ in the literature; see details. The column names should correspond to the words in the vocabulary. The topics field from the output of slda.em can be used.


model: A fitted model relating a document's topic distribution to the response variable. The model field from the output of slda.em can be used.


alpha: The scalar value of the Dirichlet hyperparameter for topic proportions. See references for details.


eta: The scalar value of the Dirichlet hyperparameter for topic multinomials.


num.iterations: Number of iterations of inference to perform on the documents.


average.iterations: Number of samples to average over to produce the predictions.


trace: When trace is greater than zero, diagnostic messages will be output. Larger values of trace imply more messages.
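
The relationship between the integer count matrix and its normalised form can be sketched in base R. This is only an illustration on a toy matrix: the values are made up, and adding eta as a smoothing term before normalising is an assumption about a common convention, not a statement of what the package does internally.

```r
## Toy K x V count matrix: 2 topics (rows), 3 vocabulary words (columns).
## All counts are invented for illustration.
topics <- matrix(c(4L, 0L, 1L,
                   1L, 3L, 6L),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, c("apple", "banana", "cherry")))

## Assumed smoothing value, chosen only for this sketch.
eta <- 0.1

## Smoothed normalisation: each row becomes a distribution over words.
## rowSums() has length K, so the division recycles row by row.
beta <- (topics + eta) / rowSums(topics + eta)

## Every topic's word probabilities sum to one.
stopifnot(all(abs(rowSums(beta) - 1) < 1e-12))
```

The functions documented here take the unnormalised integer counts, so a matrix such as beta above would be for inspection only, not for passing as topics.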


Inference is first performed on the documents using Gibbs sampling while holding the word-topic matrix $\beta_{w,k}$ constant. For a well-fit model, typically only a small number of iterations is required to obtain good fits for new documents. The resulting topic vectors are then passed through model to yield a numeric prediction for each document.
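
The last step, turning per-document topic counts into predictions, can be sketched in base R. This is a minimal illustration, not the package's implementation: doc.sums stands in for the kind of matrix slda.predict.docsums returns, coefs is a hypothetical set of per-topic coefficients, and the model is assumed to be linear with no intercept.

```r
## Hypothetical K x N matrix of topic assignment counts
## (3 topics, 2 documents); the numbers are invented.
doc.sums <- matrix(c(8, 2, 0,    # document 1
                     1, 1, 8),   # document 2
                   nrow = 3)

## Hypothetical per-topic coefficients (not taken from a real fit).
coefs <- c(-1, 0, 1)

## Convert counts to per-document topic proportions (N x K).
## colSums() has length N, so each row of t(doc.sums) is divided
## by that document's total count.
props <- t(doc.sums) / colSums(doc.sums)

## Linear prediction: one number per document.
predictions <- as.vector(props %*% coefs)
```

Document 1 is dominated by the topic with a negative coefficient and document 2 by the topic with a positive one, so their predictions fall on opposite sides of zero.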


For slda.predict, a numeric vector of the same length as documents giving the predictions. For slda.predict.docsums, a $K \times N$ matrix of document assignment counts.


Jonathan Chang


Blei, David M. and McAuliffe, John. Supervised topic models. Advances in Neural Information Processing Systems, 2008.

See Also

See lda.collapsed.gibbs.sampler for a description of the format of the input data, as well as more details on the model.

See predictive.distribution if you want to make predictions about the contents of the documents instead of the response variables.


## The sLDA demo shows an example usage of this function.
## Not run: demo(slda)
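
A fuller call sequence might look like the sketch below. It assumes the lda package is installed and that it ships the poliblog.documents, poliblog.vocab and poliblog.ratings data sets; the slda.em argument names and the settings used here are assumptions to check against ?slda.em, not values prescribed by this page.

```r
## Assumes the lda package and its bundled poliblog data are available.
library(lda)
data(poliblog.documents)
data(poliblog.vocab)
data(poliblog.ratings)

num.topics <- 10
## Random starting coefficients; an arbitrary initialisation choice.
params <- sample(c(-1, 1), num.topics, replace = TRUE)

## Fit an sLDA model (argument names assumed; see ?slda.em).
fit <- slda.em(documents = poliblog.documents, K = num.topics,
               vocab = poliblog.vocab,
               num.e.iterations = 10, num.m.iterations = 4,
               alpha = 1.0, eta = 0.1,
               annotations = poliblog.ratings / 100,
               params = params, variance = 0.25,
               lambda = 1.0, logistic = FALSE, method = "sLDA")

## Predict the response for each document from the fitted model.
predictions <- slda.predict(poliblog.documents, fit$topics, fit$model,
                            alpha = 1.0, eta = 0.1)

## Or retrieve only the document-topic assignment counts.
doc.sums <- slda.predict.docsums(poliblog.documents, fit$topics,
                                 alpha = 1.0, eta = 0.1)
```

Here the same documents are used for fitting and prediction purely to keep the sketch short; in practice the predictions would normally be made on held-out documents.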

[Package lda version 1.5.2]