validateTopic {validateIt} | R Documentation |
Create validation tasks for topic model selection
Description
Creates validation tasks (word intrusion or word set intrusion) from a fitted topic model, for use in topic model selection.
Usage
validateTopic(type, n, text = NULL, vocab, beta, theta = NULL, thres = 20)
Arguments
type |
The task structure to generate. Must be one of "WI" (word intrusion), "T8WSI" (top 8 word set intrusion), or "R4WSI" (random 4 word set intrusion). |
n |
The number of desired tasks |
text |
The pool of documents to be shown to the MTurk workers. |
vocab |
A character vector specifying the words in the corpus. Usually, it can be found in topic model output. |
beta |
A matrix of word probabilities for each topic. Each row represents a topic and each column represents a word. Note this should not be in the logged form. |
theta |
A matrix of topic proportions. Each row represents a document and each column represents a topic. Must be specified if type = "T8WSI" or "R4WSI". |
thres |
The threshold to draw words from; defaults to the top 20 words. |
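The beta argument must hold probabilities, not log-probabilities. The following is a minimal sketch of converting and checking such a matrix before passing it on, assuming a model that stores its topic-word matrix on the log scale. The name log_beta and all toy values are hypothetical and not part of the package.

```r
# Toy inputs: 2 topics over a 6-word vocabulary (all values are made up).
vocab <- c("tax", "budget", "court", "judge", "vote", "law")
log_beta <- log(rbind(
  c(0.40, 0.30, 0.05, 0.05, 0.15, 0.05),
  c(0.05, 0.05, 0.35, 0.30, 0.05, 0.20)
))

# The beta argument expects probabilities, so undo the log first.
beta <- exp(log_beta)

# Sanity checks implied by the argument descriptions:
# one column per vocabulary word, non-negative entries, rows summing to 1.
stopifnot(
  ncol(beta) == length(vocab),
  all(beta >= 0),
  isTRUE(all.equal(rowSums(beta), rep(1, nrow(beta))))
)
```

If the last check fails, the matrix is probably still on the log scale or is transposed (words in rows, topics in columns).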
Details
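This function does not fit a topic model. Users need to fit their own topic models and supply the resulting vocab, beta, and (for "T8WSI" and "R4WSI") theta.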
Value
A matrix of validation tasks. Each row represents a task and each column represents an aspect of a task, including the topic label, the document text (for "T8WSI" and "R4WSI"), and five words, including four non-intrusive words and one intrusive word.
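Examples
A hedged sketch of how the inputs might be assembled and passed to validateTopic, following the usage signature above. The toy model (K, V, D, the random matrices, and the placeholder documents) is hypothetical. It is also an assumption that the rows of theta line up with the entries of text. The calls themselves are wrapped in if (FALSE) because they require the validateIt package and have not been verified against such a small toy model.

```r
set.seed(1)

# Hypothetical toy model: 3 topics, 30 vocabulary words, 5 documents.
K <- 3
V <- 30
D <- 5
vocab <- paste0("word", seq_len(V))

# Random non-negative weights, normalised so that each row sums to 1.
beta <- matrix(runif(K * V), nrow = K)
beta <- beta / rowSums(beta)
theta <- matrix(runif(D * K), nrow = D)
theta <- theta / rowSums(theta)

# Placeholder documents; assumed to correspond row by row to theta.
text <- paste("Document", seq_len(D))

# Shapes that the argument descriptions call for.
stopifnot(ncol(beta) == V, ncol(theta) == K, nrow(theta) == length(text))

# Not run: needs the validateIt package installed.
if (FALSE) {
  # Word intrusion tasks need only vocab and beta.
  wi_tasks <- validateIt::validateTopic(type = "WI", n = 10,
                                        vocab = vocab, beta = beta,
                                        thres = 10)
  # Word set intrusion tasks also need the documents and theta.
  wsi_tasks <- validateIt::validateTopic(type = "T8WSI", n = 10, text = text,
                                         vocab = vocab, beta = beta,
                                         theta = theta, thres = 10)
}
```

With a real fitted model, vocab, beta, and theta would come from the model output rather than from random draws.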