clusterTopics {tosca} | R Documentation |
Cluster Analysis
Description
This function performs a cluster analysis of topics using the Hellinger distance.
Usage
clusterTopics(
ldaresult,
file,
tnames = NULL,
method = "average",
width = 30,
height = 15,
...
)
Arguments
ldaresult |
The result of a function call |
file |
File for the dendrogram pdf. |
tnames |
Character vector of labels for the topics. |
method |
Method statement from |
width |
Graphical parameter for pdf output. See |
height |
Graphical parameter for pdf output. See |
... |
Additional parameter for |
Details
This function is useful for analyzing topic similarities and for evaluating the right number of topics for an LDA.
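The idea can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the matrix topics is made-up data standing in for topic-word counts, and the use of stats::dist and stats::hclust is an assumption suggested by the default method = "average".

```r
# Hypothetical topic-word count matrix: 3 topics (rows) x 4 words (columns).
topics <- matrix(c(8, 1, 1, 0,
                   7, 2, 1, 0,
                   0, 1, 2, 7),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("T1", "T2", "T3"),
                                 c("fish", "man", "day", "integral")))

# Normalise each row to a probability distribution over words.
probs <- topics / rowSums(topics)

# Hellinger distance: H(p, q) = sqrt(0.5 * sum((sqrt(p) - sqrt(q))^2)),
# i.e. the Euclidean distance between square-root vectors divided by sqrt(2).
hellinger <- dist(sqrt(probs)) / sqrt(2)

# Agglomerative clustering of the topics (assumed here: average linkage).
clust <- hclust(hellinger, method = "average")
```

Topics that share most of their probability mass (T1 and T2 in this toy data) end up close together and are merged early in the tree. A tree in which many topics merge at small distances can hint that the chosen number of topics is too large.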
Value
A dendrogram as pdf and a list containing
dist |
A distance matrix |
clust |
The result from |
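A short sketch of how such a list might be inspected. The list res below is a hand-built stand-in with made-up distances, not output from clusterTopics, and it assumes that the clust element can be treated like an object returned by stats::hclust.

```r
# Stand-in for the returned list (hypothetical data, built by hand).
labs <- c("T1", "T2", "T3")
d <- as.dist(matrix(c(0.0, 0.1, 0.8,
                      0.1, 0.0, 0.7,
                      0.8, 0.7, 0.0),
                    nrow = 3, dimnames = list(labs, labs)))
res <- list(dist = d, clust = hclust(d, method = "average"))

# Pairwise topic distances as a full symmetric matrix.
distmat <- as.matrix(res$dist)

# Cut the tree into two groups of topics (assumes an hclust-like object).
groups <- cutree(res$clust, k = 2)
```

Here groups is a named integer vector that assigns each topic label to a cluster, which can help to spot near-duplicate topics.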
Examples
library(tosca)

# Three short example texts
texts <- list(A="Give a Man a Fish, and You Feed Him for a Day.
Teach a Man To Fish, and You Feed Him for a Lifetime",
  B="So Long, and Thanks for All the Fish",
  C="A very able manipulative mathematician, Fisher enjoys a real mastery
in evaluating complicated multiple integrals.")

# Combine the texts with their meta data in a textmeta object
corpus <- textmeta(meta=data.frame(id=c("A", "B", "C", "D"),
  title=c("Fishing", "Don't panic!", "Sir Ronald", "Berlin"),
  date=c("1885-01-02", "1979-03-04", "1951-05-06", "1967-06-02"),
  additionalVariable=1:4, stringsAsFactors=FALSE), text=texts)

# Preprocess the texts and build the word list
corpus <- cleanTexts(corpus)
wordlist <- makeWordlist(corpus$text)

# Prepare the LDA input and fit an LDA with K = 3 topics
ldaPrep <- LDAprep(text=corpus$text, vocab=wordlist$words)
LDA <- LDAgen(documents=ldaPrep, K = 3L, vocab=wordlist$words, num.words=3)

# Cluster the topics of the fitted LDA
clusterTopics(ldaresult=LDA)
[Package tosca version 0.3-2]