predictive.link.probability {lda} | R Documentation
Use the RTM to predict whether a link exists between two documents.
Description
This function takes a fitted LDA-type model (e.g., LDA or RTM) and makes predictions about the likelihood of a link existing between pairs of documents.
Usage
predictive.link.probability(edgelist, document_sums, alpha, beta)
Arguments
edgelist
A two-column integer matrix where each row represents an edge on which to make a prediction. An edge is expressed as a pair of integer indices (1-indexed) into the columns (i.e., documents) of document_sums (see below).
document_sums
A matrix with one column per document; see rtm.collapsed.gibbs.sampler for the format.
alpha
The value of the Dirichlet hyperparameter generating the distribution over document_sums. This, in effect, smooths the similarity between documents.
beta
A numeric vector of regression weights which is used to determine the similarity between two vectors (see details). Values will be recycled to create a vector with one weight per row of document_sums.
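The argument shapes described above can be illustrated with toy values in base R. This is only a sketch: the numbers are made up, treating rows of document_sums as topics is an assumption, and so is the recycling target (one weight per row), which this page does not state explicitly.

```r
# Toy inputs; every value here is made up for illustration.
# Columns are documents; rows are assumed to be topics.
document_sums <- matrix(c(4, 0,
                          0, 4,
                          2, 2), nrow = 2)   # 2 rows x 3 documents

# Each row is one candidate link, given as 1-indexed document columns.
edgelist <- rbind(c(1L, 2L),
                  c(1L, 3L))

alpha <- 0.1                                 # arbitrary smoothing value

# Assumed recycling: a scalar weight expanded to one weight per row.
beta <- rep(3, length.out = nrow(document_sums))

# Basic sanity checks on the edge list.
stopifnot(is.integer(edgelist),
          ncol(edgelist) == 2,
          all(edgelist >= 1L & edgelist <= ncol(document_sums)))
```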
Details
Whether or not a link exists between two documents i and j is a function of the weighted inner product of document_sums[,i] and document_sums[,j]. After normalizing document_sums column-wise, this inner product is weighted by beta. This quantity is then passed to a link probability function. Like rtm.collapsed.gibbs.sampler in this package, only the exponential link probability function is supported. Note that quantities are automatically scaled to be between 0 and 1.
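The computation described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's source: sketch_link_probability is a hypothetical name, adding alpha before normalizing is one reading of how alpha "smooths" the similarity, and subtracting max(0, beta) is merely one way to keep the exponential within (0, 1]. The scaling actually used by the package may differ.

```r
# Hypothetical helper, NOT part of the lda package.
sketch_link_probability <- function(edgelist, document_sums, alpha, beta) {
  K <- nrow(document_sums)
  beta <- rep(beta, length.out = K)        # assumed: one weight per row

  # Assumed smoothing: add alpha, then normalize each column to sum to 1.
  smoothed <- document_sums + alpha
  props <- sweep(smoothed, 2, colSums(smoothed), "/")

  # Columns for the two endpoints of every edge (K x number of edges).
  from <- props[, edgelist[, 1], drop = FALSE]
  to   <- props[, edgelist[, 2], drop = FALSE]

  # Weighted inner product per edge; beta recycles down each column.
  score <- colSums(from * to * beta)

  # Exponential link. Assumed scaling: score <= max(0, beta) for
  # normalized columns, so the result lies in (0, 1].
  exp(score - max(0, beta))
}

ds <- matrix(c(4, 0, 0, 4, 2, 2), nrow = 2)  # made-up counts
el <- rbind(c(1L, 2L), c(1L, 3L), c(1L, 1L))
p <- sketch_link_probability(el, ds, alpha = 0.1, beta = 3)
```

With a positive weight, pairs of documents whose normalized columns overlap more receive a higher value, so in this toy case the pair (1, 3) scores above the pair (1, 2).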
Value
A numeric vector of length dim(edgelist)[1], representing the probability of a link existing between each pair of documents given in the edge list.
Author(s)
Jonathan Chang (slycoder@gmail.com)
References
Chang, Jonathan and Blei, David M. Relational Topic Models for Document Networks. Artificial intelligence and statistics. 2009.
See Also
rtm.collapsed.gibbs.sampler for the format of document_sums. links.as.edgelist produces values for edgelist. predictive.distribution makes predictions about document content instead.
Examples
## See demo.
## Not run: demo(rtm)
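For a quick check without running the full demo, a call on toy inputs might look like the sketch below. The matrix, edge list, alpha and beta values are all made up for illustration, and a real analysis would take document_sums from a fitted model instead. The call is guarded so that the snippet does nothing if the lda package is not installed.

```r
# Toy inputs, made up for illustration (2 rows x 3 documents).
document_sums <- matrix(c(4, 0, 0, 4, 2, 2), nrow = 2)
edgelist <- rbind(c(1L, 2L), c(1L, 3L))      # 1-indexed document pairs

if (requireNamespace("lda", quietly = TRUE)) {
  # alpha = 0.1 and beta = 3 are arbitrary illustrative values.
  p <- lda::predictive.link.probability(edgelist, document_sums,
                                        alpha = 0.1, beta = 3)
  print(p)  # one value per row of edgelist
}
```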