predict.topics {maptpx}    R Documentation

topic predict

Description

Predict topic weights for new documents from a fitted topic model.

Usage

## S3 method for class 'topics'
predict( object, newcounts, loglhd=FALSE, ... )

Arguments

object

An output object from the topics function, or the corresponding matrix of estimated topics.

newcounts

A matrix of multinomial phrase/category counts for new documents/observations, with one row per document and nrow(object$theta) columns. Can be either a simple matrix or a simple_triplet_matrix.

loglhd

Logical; whether to calculate and return sum(x*log(p)), the un-normalized log likelihood. Defaults to FALSE.

...

Additional arguments to the undocumented internal tpx* functions.

Details

Under the default mixed-membership topic model, this function uses sequential quadratic programming to fit topic weights \Omega for new documents. Conditional on object$theta, the estimate for each new \omega_i is the maximum a posteriori (MAP) value in the (K-1)-dimensional logit-transformed parameter space.
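
To illustrate why K topic weights correspond to only K-1 free parameters, here is a minimal base-R sketch of one common logit parameterization, using the first topic as the reference category. The helpers to_logit and from_logit are hypothetical names introduced for this illustration; they are not part of maptpx, and the package's internal parameterization may differ.

```r
## Hypothetical helpers, NOT part of maptpx: an assumed logit
## parameterization with the first topic as the reference category.
to_logit <- function(w) log(w[-1] / w[1])   # K weights -> K-1 unconstrained reals

from_logit <- function(eta) {               # K-1 reals -> K weights on the simplex
  e <- exp(c(0, eta))                       # reference topic has logit 0
  e / sum(e)
}

w      <- c(0.5, 0.3, 0.2)                  # topic weights for one document, K = 3
eta    <- to_logit(w)                       # two free parameters
w_back <- from_logit(eta)                   # round trip back to the simplex
```

Any real-valued eta maps back to a valid weight vector (positive, summing to one), which is what makes unconstrained optimization in the transformed space convenient.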

Value

The output is an nrow(newcounts) by object$K matrix of document topic weights or, if loglhd=TRUE, a list containing these weights as W and the log likelihood as L.
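
As a rough guide to what the quantity sum(x*log(p)) measures, the following base-R sketch computes it by hand for small simulated data. It does not call maptpx: rdirichlet_rows is a hypothetical stand-in sampler, the flat Dirichlet parameters are chosen only for numerical convenience, and the true simulated weights are used in place of fitted ones.

```r
set.seed(1)
K <- 3; p <- 20; n <- 5                      # topics, phrases, documents

## Hypothetical Dirichlet sampler (not from maptpx): normalized gamma draws,
## one draw per row.
rdirichlet_rows <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = alpha), nrow = n, byrow = TRUE)
  g / rowSums(g)
}

theta <- t(rdirichlet_rows(K, rep(1, p)))    # p x K, each column sums to 1
omega <- rdirichlet_rows(n, rep(1, K))       # n x K, each row sums to 1
Q     <- omega %*% t(theta)                  # n x p multinomial probabilities

counts <- matrix(0, nrow = n, ncol = p)
for (i in 1:n) counts[i, ] <- rmultinom(1, size = 50, prob = Q[i, ])

## Un-normalized log likelihood: the multinomial coefficient is omitted.
L <- sum(counts * log(Q))
```

The value is always non-positive, and larger (closer to zero) values indicate that the probabilities implied by the weights agree better with the observed counts.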

Author(s)

Matt Taddy <mataddy@gmail.com>

References

Taddy (2012), On Estimation and Selection for Topic Models. http://arxiv.org/abs/1109.4518

See Also

topics, plot.topics, summary.topics, congress109

Examples


## Simulate some data
omega <- t(rdir(500, rep(1/10,10)))     # 500 documents x 10 topics
theta <- rdir(10, rep(1/1000,1000))     # 1000 phrases x 10 topics
Q <- omega%*%t(theta)                   # 500 x 1000 phrase probabilities
counts <- matrix(ncol=1000, nrow=500)
totals <- rpois(500, 200)
for(i in 1:500){ counts[i,] <- rmultinom(1, size=totals[i], prob=Q[i,]) }

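## predict omega given theta
## (theta is a plain matrix, not a 'topics' object, so the method is called directly)
W <- predict.topics( theta, counts )
## fitted weights against the simulated ones
plot(W, omega, pch=21, bg=8)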


[Package maptpx version 1.9-7 Index]