jsTopics {ldaPrototype} | R Documentation |
Pairwise Jensen-Shannon Similarities (Divergences)
Description
Calculates the similarity of all pairwise topic combinations using the Jensen-Shannon Divergence.
Usage
jsTopics(topics, epsilon = 1e-06, progress = TRUE, pm.backend, ncpus)
Arguments
topics |
[ |
epsilon |
[ |
progress |
[ |
pm.backend |
[ |
ncpus |
[ |
Details
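The Jensen-Shannon Similarity for two topics $z_i$ and $z_j$ is calculated by
$$JS(z_i, z_j) = 1 - \frac{1}{2}\left( KLD\left(p_i, \frac{p_i + p_j}{2}\right) + KLD\left(p_j, \frac{p_i + p_j}{2}\right) \right),$$
where $V$ is the vocabulary size, $p_k = (p_k^{(1)}, \ldots, p_k^{(V)})$, and $p_k^{(v)}$ is the proportion of assignments of the $v$-th word to the $k$-th topic. KLD defines the Kullback-Leibler Divergence calculated by
$$KLD(p, q) = \sum_{v=1}^{V} p^{(v)} \log\frac{p^{(v)}}{q^{(v)}}.$$
There is an epsilon added to every $n_k^{(v)}$, the count (not proportion) of assignments, to ensure computability with respect to zeros.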
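The calculation described in this section can be sketched in base R for a single pair of topics. This is a minimal illustration, not the package's implementation: the helper name js_similarity is hypothetical, the natural logarithm is assumed, and each topic is assumed to be given as a vector of word counts over the same vocabulary.

```r
# Hypothetical helper, not part of ldaPrototype: Jensen-Shannon similarity
# of two topics given as word-count vectors of equal length.
js_similarity = function(n_i, n_j, epsilon = 1e-06) {
  stopifnot(length(n_i) == length(n_j))
  # epsilon is added to the counts (not the proportions) to avoid log(0)
  p_i = (n_i + epsilon) / sum(n_i + epsilon)
  p_j = (n_j + epsilon) / sum(n_j + epsilon)
  # mixture distribution of both topics
  m = (p_i + p_j) / 2
  # Kullback-Leibler divergence, natural logarithm assumed
  kld = function(p, q) sum(p * log(p / q))
  1 - (kld(p_i, m) + kld(p_j, m)) / 2
}

counts_a = c(10, 0, 5, 1)
counts_b = c(0, 8, 5, 2)
js_similarity(counts_a, counts_a)  # identical topics give a similarity of 1
js_similarity(counts_a, counts_b)  # differing topics give a smaller value
```

Under the natural-log assumption, two topics with no words in common approach the lower bound 1 - log(2) as epsilon goes to zero.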
Value
[named list] with entries
sims
[lower triangular named matrix] with all pairwise similarities of the given topics.
wordslimit
[integer] = vocabulary size. See jaccardTopics for original purpose.
wordsconsidered
[integer] = vocabulary size. See jaccardTopics for original purpose.
param
[named list] with parameter specifications for type [character(1)] = "Cosine Similarity" and epsilon [numeric(1)]. See above for explanation.
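To make the layout of the returned list concrete, the following sketch builds a comparable structure by hand from a toy count matrix. It is an illustration under assumptions, not the package's code: the matrix toy, its orientation (words in rows, topics in columns), the natural logarithm, and the choice to fill the diagonal are all assumptions, and the result is a plain list rather than the object returned by jsTopics.

```r
# Toy count matrix (assumed orientation: words in rows, topics in columns)
toy = matrix(c(10, 0, 5, 1,
               0, 8, 5, 2,
               9, 1, 4, 1),
             nrow = 4,
             dimnames = list(paste0("word", 1:4), paste0("topic", 1:3)))
epsilon = 1e-06

# column-wise proportions after adding epsilon to the counts
p = sweep(toy + epsilon, 2, colSums(toy + epsilon), "/")

K = ncol(toy)
sims = matrix(NA_real_, K, K, dimnames = list(colnames(toy), colnames(toy)))
for (i in seq_len(K)) {
  for (j in seq_len(i)) {
    # mixture of topic i and topic j
    m = (p[, i] + p[, j]) / 2
    # fill the lower triangle (diagonal included here for illustration)
    sims[i, j] = 1 - (sum(p[, i] * log(p[, i] / m)) +
                      sum(p[, j] * log(p[, j] / m))) / 2
  }
}

# plain list mimicking the documented entries
out = list(sims = sims, wordslimit = nrow(toy), wordsconsidered = nrow(toy),
           param = list(epsilon = epsilon))
out$sims
```

The upper triangle stays NA, so each topic pair appears exactly once, which is the point of a lower triangular layout.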
See Also
Other TopicSimilarity functions: cosineTopics(), dendTopics(), getSimilarity(), jaccardTopics(), rboTopics()
Examples
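# Fit 4 LDA replications with K = 10 topics each on the Reuters example data
res = LDARep(docs = reuters_docs, vocab = reuters_vocab, n = 4, K = 10, num.iterations = 30)
# Merge the topics of all replications into one matrix
topics = mergeTopics(res, vocab = reuters_vocab)
# Pairwise Jensen-Shannon similarities with the default epsilon = 1e-06
js = jsTopics(topics)
js
sim = getSimilarity(js)
dim(sim)
# Repeat with a much larger epsilon and compare the resulting similarities
js1 = jsTopics(topics, epsilon = 1)
sim1 = getSimilarity(js1)
summary((sim1 - sim)[lower.tri(sim)])
plot(sim, sim1, xlab = "epsilon = 1e-6", ylab = "epsilon = 1")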