getPrototype {ldaPrototype} | R Documentation |
Determine the Prototype LDA
Description
Returns the Prototype LDA of a set of LDAs. This set is given as an LDABatch object, an LDARep object, or a list of LDAs. If the matrix of S-CLOP scores sclop is passed, no calculation is needed or done.
Usage
getPrototype(...)
## S3 method for class 'LDARep'
getPrototype(
x,
vocab,
limit.rel,
limit.abs,
atLeast,
progress = TRUE,
pm.backend,
ncpus,
keepTopics = FALSE,
keepSims = FALSE,
keepLDAs = FALSE,
sclop,
...
)
## S3 method for class 'LDABatch'
getPrototype(
x,
vocab,
limit.rel,
limit.abs,
atLeast,
progress = TRUE,
pm.backend,
ncpus,
keepTopics = FALSE,
keepSims = FALSE,
keepLDAs = FALSE,
sclop,
...
)
## Default S3 method:
getPrototype(
lda,
vocab,
id,
job,
limit.rel,
limit.abs,
atLeast,
progress = TRUE,
pm.backend,
ncpus,
keepTopics = FALSE,
keepSims = FALSE,
keepLDAs = FALSE,
sclop,
...
)
Arguments
... |
additional arguments |
x |
|
vocab |
[ |
limit.rel |
[0,1] |
limit.abs |
[ |
atLeast |
[ |
progress |
[ |
pm.backend |
[ |
ncpus |
[ |
keepTopics |
[ |
keepSims |
[ |
keepLDAs |
[ |
sclop |
[ |
lda |
[ |
id |
[ |
job |
[ |
Details
While LDAPrototype is the overall shortcut for performing multiple LDA runs and choosing their Prototype, getPrototype only covers the step of determining the Prototype. The multiple LDAs have to be generated before this function is used. The function is flexible enough to be used at at least two points in the analysis: after generating the LDAs (no matter whether as LDABatch or LDARep object) or after determining the pairwise S-CLOP values.
To save memory, many interim results are discarded by default (see keepTopics, keepSims and keepLDAs).
If you use parallel computation, no progress bar is shown.
For details see the details sections of the workflow functions.
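The second entry point (passing precomputed S-CLOP scores) can be illustrated in base R. The sketch below assumes that the Prototype is the run with the highest mean pairwise S-CLOP score to all other runs. That selection rule, the helper name select_prototype_id and the toy scores are assumptions made for illustration; they are not taken from this page and the package's own implementation may differ.

```r
# Hypothetical helper, not part of ldaPrototype: pick the run whose mean
# pairwise S-CLOP score to all other runs is highest (assumed selection rule).
select_prototype_id = function(sclop) {
  stopifnot(is.matrix(sclop), nrow(sclop) == ncol(sclop),
    !is.null(colnames(sclop)))
  diag(sclop) = NA  # ignore the comparison of a run with itself
  mean_score = colMeans(sclop, na.rm = TRUE)
  names(mean_score)[which.max(mean_score)]
}

# Toy symmetric score matrix for three runs (made-up values).
run_names = c("run1", "run2", "run3")
toy_sclop = matrix(c(
  NA,  0.8, 0.6,
  0.8, NA,  0.7,
  0.6, 0.7, NA),
  nrow = 3, byrow = TRUE, dimnames = list(run_names, run_names))

toy_protoid = select_prototype_id(toy_sclop)
```

In this toy example the mean scores are 0.70, 0.75 and 0.65, so the sketch selects run2.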
Value
[named list] with entries
id
[character(1)] See above.
protoid
[character(1)] Name (ID) of the determined Prototype LDA.
lda
List of LDA objects of the determined Prototype LDA and - if keepLDAs is TRUE - all considered LDAs.
jobs
[data.table] with parameter specifications for the LDAs.
param
[named list] with parameter specifications for limit.rel [0,1], limit.abs [integer(1)] and atLeast [integer(1)]. See above for explanation.
topics
[named matrix] with the count of vocabularies (row wise) in topics (column wise).
sims
[lower triangular named matrix] with all pairwise Jaccard similarities of the given topics.
wordslimit
[integer] with counts of words determined as relevant based on limit.rel and limit.abs.
wordsconsidered
[integer] with counts of considered words for similarity calculation. Could differ from wordslimit, if atLeast is greater than zero.
sclop
[symmetrical named matrix] with all pairwise S-CLOP scores of the given LDA runs.
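The sims entry holds pairwise Jaccard similarities between topics. As a minimal base-R sketch, assuming each topic is represented by its set of considered words, the Jaccard similarity is the size of the intersection divided by the size of the union. The function name jaccard_sets and the toy word sets are hypothetical; the package computes its similarities from the topics matrix and may handle details differently.

```r
# Hypothetical helper, not part of ldaPrototype: Jaccard similarity of two
# word sets, |A intersect B| / |A union B|.
jaccard_sets = function(a, b) {
  a = unique(a)
  b = unique(b)
  union_size = length(union(a, b))
  if (union_size == 0) return(NA_real_)  # undefined for two empty sets
  length(intersect(a, b)) / union_size
}

# Made-up word sets standing in for the considered words of two topics.
topic1 = c("oil", "price", "barrel", "opec")
topic2 = c("oil", "price", "market")

toy_sim = jaccard_sets(topic1, topic2)  # 2 shared words / 5 distinct words
```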
See Also
Other shortcut functions: LDAPrototype()
Other PrototypeLDA functions: LDAPrototype(), getSCLOP()
Other workflow functions: LDARep(), SCLOP(), dendTopics(), jaccardTopics(), mergeTopics()
Examples
res = LDARep(docs = reuters_docs, vocab = reuters_vocab,
  n = 4, K = 10, num.iterations = 30)
topics = mergeTopics(res, vocab = reuters_vocab)
jacc = jaccardTopics(topics, atLeast = 2)
dend = dendTopics(jacc)
sclop = SCLOP.pairwise(jacc)
getPrototype(lda = getLDA(res), sclop = sclop)
proto = getPrototype(res, vocab = reuters_vocab, keepSims = TRUE,
  limit.abs = 20, atLeast = 10)
proto
getPrototype(proto) # = getLDA(proto)
getConsideredWords(proto)
# > 10 if several words tie for the 10th most frequent word
getRelevantWords(proto)
getSCLOP(proto)