validateTopic {validateIt}    R Documentation

Create validation tasks for topic model selection

Description

Creates validation tasks (word intrusion or word set intrusion) from the output of a fitted topic model, for use in topic model selection.

Usage

validateTopic(type, n, text = NULL, vocab, beta, theta = NULL, thres = 20)

Arguments

type

The task structure to create. Must be one of "WI" (word intrusion), "T8WSI" (top 8 word set intrusion), or "R4WSI" (random 4 word set intrusion).

n

The number of tasks to create.

text

The pool of documents to be shown to the MTurk workers.

vocab

A character vector specifying the words in the corpus. Usually, it can be found in topic model output.

beta

A matrix of word probabilities for each topic. Each row represents a topic and each column represents a word. Note that the entries should be probabilities, not log probabilities.

theta

A matrix of topic proportions. Each row represents a document and each column represents a topic. Must be specified if type = "T8WSI" or "R4WSI".

thres

The threshold to draw words from. Defaults to the top 20 words, as shown in Usage.

Details

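This function does not fit a topic model. Users need to fit their own topic model and pass its output (vocab, beta and, where required, theta) to this function.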
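As a rough sketch of the shapes the arguments are expected to have, the base R snippet below builds stand-in vocab, beta, and theta objects and checks them against the argument descriptions. The simulated matrices, the sizes K, V, and D, and the name log_beta are assumptions made for illustration only; in practice these objects would come from a fitted topic model. Some topic model implementations store word probabilities on the log scale, in which case they would need to be exponentiated first.

```r
set.seed(1)

K <- 3    # number of topics (illustrative)
V <- 30   # vocabulary size (illustrative)
D <- 5    # number of documents (illustrative)

# Stand-in vocabulary; a fitted model would supply the real one.
vocab <- paste0("word", seq_len(V))

# Stand-in for word probabilities stored on the log scale
# (hypothetical; not taken from any particular package).
raw_beta <- matrix(rexp(K * V), nrow = K, ncol = V)
log_beta <- log(raw_beta / rowSums(raw_beta))

# beta should hold probabilities, so undo the log.
beta <- exp(log_beta)

# Stand-in topic proportions: one row per document, one column per topic.
raw_theta <- matrix(rexp(D * K), nrow = D, ncol = K)
theta <- raw_theta / rowSums(raw_theta)

# Shape checks implied by the argument descriptions.
stopifnot(
  ncol(beta) == length(vocab),          # one column per word
  nrow(beta) == ncol(theta),            # one row per topic
  all(abs(rowSums(beta) - 1) < 1e-8),   # each topic is a distribution over words
  all(abs(rowSums(theta) - 1) < 1e-8)   # each document is a distribution over topics
)
```

Running checks like these before creating tasks can catch a transposed or logged beta matrix early.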

Value

A matrix of validation tasks. Each row represents a task and each column represents an aspect of a task, including the topic label, the document text (for "T8WSI" and "R4WSI"), and five words, including four non-intrusive words and one intrusive word.
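To make the shape of a single word intrusion task more concrete, here is a minimal sketch in base R. It is not the validateIt implementation: the helper sketch_wi_task, its sampling rules, and the toy beta matrix are all hypothetical, and only the idea of four non-intrusive words plus one intrusive word drawn from the top thres words is taken from the description above.

```r
# Hypothetical helper, NOT the validateIt implementation.
sketch_wi_task <- function(topic, vocab, beta, thres = 20) {
  stopifnot(ncol(beta) == length(vocab), nrow(beta) >= 2,
            thres >= 4, thres <= length(vocab))

  # Indices of the `thres` most probable words for topic k.
  top_idx <- function(k) order(beta[k, ], decreasing = TRUE)[seq_len(thres)]

  # Four non-intrusive words drawn from the topic's own top words.
  own <- top_idx(topic)
  keep <- own[sample.int(length(own), 4)]

  # Choose a different topic and take one of its top words that is
  # not among this topic's top words.
  others <- setdiff(seq_len(nrow(beta)), topic)
  other <- others[sample.int(length(others), 1)]
  candidates <- setdiff(top_idx(other), own)
  stopifnot(length(candidates) > 0)
  intruder <- candidates[sample.int(length(candidates), 1)]

  list(topic = topic, words = vocab[keep], intruder = vocab[intruder])
}

# Toy input: two topics with clearly separated top words.
set.seed(42)
vocab <- paste0("w", 1:12)
beta <- rbind(
  c(rep(0.15, 6), rep(0.1 / 6, 6)),   # topic 1 favors w1..w6
  c(rep(0.1 / 6, 6), rep(0.15, 6))    # topic 2 favors w7..w12
)

task <- sketch_wi_task(topic = 1, vocab = vocab, beta = beta, thres = 5)
str(task)
```

In this toy setup the four non-intrusive words come from w1 to w6 and the intruder from w7 to w12, so a reader who understands topic 1 should be able to spot the word that does not belong.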


[Package validateIt version 1.2.1 Index]