intruderWords {tosca} | R Documentation |
Function to validate the fit of the LDA model
Description
This function validates an LDA result by presenting a mix of words from a topic and intruder words to a human user, who has to identify the intruders.
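The idea behind the task can be illustrated with a minimal base-R sketch. It is not the implementation used by intruderWords: the helper name make_intruder_task, the toy matrix, and the rule for choosing intruder candidates (top words of other topics that are not top words of the current topic) are assumptions made for illustration only.

```r
# Hypothetical helper (not part of tosca): build one intruder-word task
# for a topic from a topics-by-words matrix `beta`.
make_intruder_task <- function(beta, topic, numTopwords = 3L,
                               numIntruder = 1L, numOutwords = 3L) {
  # Top words of every topic, ranked by their value in beta
  tops <- lapply(seq_len(nrow(beta)), function(k) {
    names(sort(beta[k, ], decreasing = TRUE))[seq_len(numTopwords)]
  })
  own <- tops[[topic]]
  # Assumed rule: intruder candidates are top words of the other topics
  # that are not top words of the current topic
  candidates <- setdiff(unique(unlist(tops[-topic])), own)
  # Index-based sampling avoids sample()'s special case for length-1 input
  intruders <- candidates[sample.int(length(candidates), numIntruder)]
  # numOutwords counts the intruders too, so fewer genuine words are drawn
  genuine <- own[sample.int(length(own), numOutwords - numIntruder)]
  shown <- c(genuine, intruders)
  # Shuffle so the position of the intruder gives nothing away
  list(words = shown[sample.int(length(shown))], intruders = intruders)
}

# Toy data: 2 topics, 6 words (each row sums to 1)
beta <- rbind(
  c(0.40, 0.30, 0.20, 0.05, 0.03, 0.02),
  c(0.02, 0.03, 0.05, 0.20, 0.30, 0.40)
)
colnames(beta) <- c("tax", "budget", "debt", "school", "teacher", "pupil")

set.seed(1)
task <- make_intruder_task(beta, topic = 1)
print(task$words)  # two words from topic 1 plus one intruder, shuffled
```

If the topics are coherent, a reader should be able to spot the word that does not belong; if the intruder is hard to find, the topic is probably not well separated from the others.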
Usage
intruderWords(
beta = NULL,
byScore = TRUE,
numTopwords = 30L,
numIntruder = 1L,
numOutwords = 5L,
noTopic = TRUE,
printSolution = FALSE,
oldResult = NULL,
test = FALSE,
testinput = NULL
)
Arguments
beta |
A matrix of word-probabilities or frequency table for the topics (e.g. the |
byScore |
Logical: Should the score of |
numTopwords |
The number of topwords to be used for the intruder words |
numIntruder |
Intended number of intruder words. If |
numOutwords |
Integer: Number of words per topic, including the intruder words. |
noTopic |
Logical: Is |
printSolution |
tba |
oldResult |
Result object from an unfinished run of |
test |
Logical: Enables test mode |
testinput |
Input for function tests |
Value
Object of class IntruderWords. List of 7
result |
Matrix of 3 columns. Each row represents one topic. All values are 0 if the topic did not run before. |
beta |
Parameter of the function call |
byScore |
Parameter of the function call |
numTopwords |
Parameter of the function call |
numIntruder |
Parameter of the function call |
numOutwords |
Parameter of the function call |
noTopic |
Parameter of the function call |
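Because rows of result that contain only zeros mark topics that have not been run yet, the remaining topics can be found with a row-wise check. The following is a minimal sketch on a made-up matrix; the values are invented and nothing is assumed about the meaning of the three columns.

```r
# Toy stand-in for the `result` element of an IntruderWords object;
# the numbers are invented and the column meanings are not assumed
result <- rbind(
  c(1, 1, 5),
  c(0, 0, 0),
  c(2, 1, 5),
  c(0, 0, 0)
)

# A topic has not been run yet if every value in its row is 0
pending <- which(rowSums(result != 0) == 0)
print(pending)
```

With a real result object, the same check would be applied to its result element instead of the toy matrix.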
References
Chang, Jonathan, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. Reading Tea Leaves: How Humans Interpret Topic Models. Advances in Neural Information Processing Systems, 2009.
Examples
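## Not run: 
data(politics)
# Clean the raw texts
poliClean <- cleanTexts(politics)
# Keep only words that occur more than 10 times
words10 <- makeWordlist(text=poliClean$text)
words10 <- words10$words[words10$wordtable > 10]
# Prepare the documents and fit an LDA model with 10 topics
poliLDA <- LDAprep(text=poliClean$text, vocab=words10)
LDAresult <- LDAgen(documents=poliLDA, K=10, vocab=words10)
# Start the interactive intruder-word validation
intruder <- intruderWords(beta=LDAresult$topics)
## End(Not run)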