SCLOP {ldaPrototype} | R Documentation
Similarity/Stability of multiple sets of Objects using Clustering with Local Pruning
Description
The function SCLOP calculates the S-CLOP value for the best possible local pruning state of a dendrogram from dendTopics. The function pruneSCLOP supplies the corresponding pruning state itself.
To get the S-CLOP scores for every pair of LDA runs, the function SCLOP.pairwise can be used. It returns a matrix of the pairwise S-CLOP scores.
All three functions use the function disparitySum to calculate the least possible sum of disparities (on the best possible local pruning state) on a given dendrogram.
Usage
SCLOP(dend)
disparitySum(dend)
SCLOP.pairwise(sims)
Arguments
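dend | [dendrogram] Output from dendTopics.
sims | [TopicSimilarity object or similarity matrix] Topic similarities of the LDA runs to compare, for example the output of jaccardTopics or of getSimilarity, as in the Examples.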
Details
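For one specific cluster $g$ and $R$ LDA runs, the disparity is calculated by
$$U(g) := \frac{1}{R} \sum_{r=1}^{R} \left| t_r^{(g)} - 1 \right| \cdot \sum_{r=1}^{R} t_r^{(g)},$$
where $t^{(g)} = (t_1^{(g)}, \ldots, t_R^{(g)})^T$ contains the number of topics that belong to the different LDA runs and that occur in cluster $g$.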
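As a small illustration of the disparity of a single cluster, the following base R sketch assumes the form U(g) = (1/R) * sum_r |t_r - 1| * sum_r t_r, where t_r is the number of topics from run r in the cluster. The helper name `disparity` is hypothetical and not part of ldaPrototype.

```r
# Hypothetical helper (not exported by ldaPrototype): disparity of one cluster.
# t: numeric vector of length R; t[r] is the number of topics from LDA run r
#    that fall into the cluster.
disparity <- function(t) {
  R <- length(t)
  sum(abs(t - 1)) * sum(t) / R
}

disparity(c(1, 1, 1))  # every run contributes exactly one topic: 0
disparity(c(2, 0, 1))  # one run twice, one run missing: 2
disparity(c(1, 0, 0))  # singleton cluster with R = 3: 2/3
```

Under this assumption, a cluster is penalized both for runs that are missing or over-represented and for its overall size, so a cluster with exactly one topic per run has disparity zero.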
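The function disparitySum returns the least possible sum of disparities $U_{\Sigma}(G^*)$ for the best possible pruning state $G^*$ with $U_{\Sigma}(G) = \sum_{g \in G} U(g) \to \min$.
The highest possible value for $U_{\Sigma}(G^*)$ is limited by
$$U_{\Sigma,\max} := \sum_{g \in \tilde{G}} U(g) = N \cdot \frac{R-1}{R},$$
where $\tilde{G}$ denotes the corresponding worst case pruning state and $N$ is the total number of topics. This worst case scenario is useful for normalizing the SCLOP scores.
The function SCLOP then calculates the value
$$\text{S-CLOP}(G^*) := 1 - \frac{1}{U_{\Sigma,\max}} \cdot \sum_{g \in G^*} U(g) \in [0,1],$$
where $\sum_{g \in G^*} U(g) = U_{\Sigma}(G^*)$.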
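The normalization can be sketched in base R for a pruning state that is already given. The sketch assumes the cluster disparity U(g) = (1/R) * sum_r |t_r - 1| * sum_r t_r and the upper bound N * (R - 1) / R, with N the total number of topics. The helper `sclop_from_counts` and its input layout (one row per cluster, one column per LDA run) are hypothetical. It does not search for the best pruning state, which is what disparitySum does on a dendrogram.

```r
# Hypothetical helper (not part of ldaPrototype): normalized score of a
# fixed pruning state.
# counts: matrix with one row per cluster and one column per LDA run;
#         counts[g, r] is the number of topics from run r in cluster g.
sclop_from_counts <- function(counts) {
  R <- ncol(counts)
  N <- sum(counts)                                   # total number of topics
  u <- rowSums(abs(counts - 1)) * rowSums(counts) / R  # disparity per cluster
  u_max <- N * (R - 1) / R                           # all-singletons bound
  1 - sum(u) / u_max
}

# Two runs with three topics each (N = 6).
perfect    <- matrix(1, nrow = 3, ncol = 2)          # every cluster is a matched pair
singletons <- rbind(diag(2), diag(2), diag(2))       # every topic is its own cluster
mixed      <- rbind(c(1, 1), c(1, 1), c(1, 0), c(0, 1))

sclop_from_counts(perfect)     # 1
sclop_from_counts(singletons)  # 0
sclop_from_counts(mixed)       # 2/3
```

The all-singletons state reproduces the bound exactly, which is why dividing by it maps the best pruning state into [0,1].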
Value
SCLOP
[0,1] value specifying the S-CLOP for the best possible local pruning state of the given dendrogram.
disparitySum
[numeric(1)] value specifying the least possible sum of disparities on the given dendrogram.
SCLOP.pairwise
[symmetrical named matrix] with all pairwise S-CLOP scores of the given LDA runs.
See Also
Other SCLOP functions: pruneSCLOP()
Other workflow functions: LDARep(), dendTopics(), getPrototype(), jaccardTopics(), mergeTopics()
Examples
res = LDARep(docs = reuters_docs, vocab = reuters_vocab, n = 4, K = 10, num.iterations = 30)
topics = mergeTopics(res, vocab = reuters_vocab)
jacc = jaccardTopics(topics, atLeast = 2)
dend = dendTopics(jacc)
SCLOP(dend)
disparitySum(dend)
SCLOP.pairwise(jacc)
SCLOP.pairwise(getSimilarity(jacc))