R: Calculate a matrix whose rows represent P(topic_i|tokens)

CalcGamma {textmineR}

R Documentation

Calculate a matrix whose rows represent P(topic_i|tokens)

Description

This function takes a phi matrix (P(token|topic)) and a theta matrix (P(topic|document)) and returns the gamma matrix (P(topic|token)), also referred to as phi prime. The gamma matrix can be used for classifying new documents and for generating alternative topic labels.
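The conversion from P(token|topic) to P(topic|token) follows Bayes' rule: P(topic|token) = P(token|topic) P(topic) / P(token). The following is a minimal base-R sketch on toy matrices, assuming every document is weighted equally when computing P(topic). The objects `phi_toy`, `theta_toy`, and `gamma_toy` are made up for illustration, and this is not necessarily how `CalcGamma` is implemented internally.

```r
# Toy model: 2 topics, 3 tokens, 2 documents (all values are made up)
phi_toy <- matrix(c(0.5, 0.3, 0.2,
                    0.1, 0.1, 0.8),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("t_1", "t_2"), c("a", "b", "c")))
theta_toy <- matrix(c(0.9, 0.1,
                      0.3, 0.7),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("d_1", "d_2"), c("t_1", "t_2")))

# Assumption: every document gets equal weight
p_d <- rep(1 / nrow(theta_toy), nrow(theta_toy))

# P(topic) = sum over documents of P(topic|document) * P(document)
p_topic <- as.vector(p_d %*% theta_toy)

# Joint P(topic, token): scale each row (topic) of phi by that topic's P(topic)
joint <- phi_toy * p_topic

# Bayes' rule: divide each column by P(token), the column sum of the joint
gamma_toy <- sweep(joint, 2, colSums(joint), "/")

# Each column of gamma_toy is a probability distribution over topics
colSums(gamma_toy)
```

The result has the same shape as `phi_toy`: rows are topics and columns are tokens.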

Usage

CalcGamma(phi, theta, p_docs = NULL, correct = TRUE)

Arguments

`phi`	The phi matrix whose rows index topics and columns index words. The i, j entries are P(word_j | topic_i)
`theta`	The theta matrix whose rows index documents and columns index topics. The i, j entries are P(topic_j | document_i)
`p_docs`	Optional. A numeric vector of length `nrow(theta)` that is proportional to the number of terms in each document. Defaults to `NULL`.
`correct`	Logical. If `TRUE`, any `NA` or `NaN` values in the final result are set to zero, which is useful when hitting computational underflow. Defaults to `TRUE`. Set to `FALSE` for troubleshooting or diagnostics.
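To illustrate the kind of value that `correct = TRUE` guards against, here is a small base-R sketch. It assumes a token ends up with zero probability under every topic (for example after underflow), so that normalizing divides 0 by 0. The matrix `joint_toy` and its values are hypothetical and are not taken from `CalcGamma` itself.

```r
# Made-up joint probabilities P(topic, token); token "c" is zero for every topic
joint_toy <- matrix(c(0.3, 0.2, 0,
                      0.1, 0.4, 0),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("t_1", "t_2"), c("a", "b", "c")))

# Normalizing each column divides 0 by 0 for token "c", which yields NaN
g <- sweep(joint_toy, 2, colSums(joint_toy), "/")
has_nan <- any(is.nan(g))

# The kind of replacement described for correct = TRUE: NA and NaN become 0
# (is.na() is TRUE for both NA and NaN)
g[is.na(g)] <- 0
```

With `correct = FALSE` such entries would stay visible, which is why that setting is suggested for diagnostics.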

Value

Returns a matrix whose rows correspond to topics and whose columns correspond to tokens. The i, j entry is P(topic_i|token_j).
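One way a matrix of this shape could be used to score a new document is sketched below in base R. It assumes the document's topic distribution is approximated by weighting P(topic|token) by the document's token proportions. The objects `gamma_demo` and `new_doc` are made up for illustration, and other prediction methods exist.

```r
# Made-up P(topic|token) matrix: rows are topics, columns are tokens
gamma_demo <- matrix(c(0.9, 0.6, 0.2,
                       0.1, 0.4, 0.8),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("t_1", "t_2"), c("a", "b", "c")))

# Token counts for one hypothetical new document
new_doc <- c(a = 3, b = 1, c = 0)

# Convert counts to token proportions, P(token|document)
p_token <- new_doc / sum(new_doc)

# Approximate P(topic|document) as the sum over tokens of
# P(topic|token) * P(token|document); index by name to align the vocabulary
doc_topics <- as.vector(gamma_demo %*% p_token[colnames(gamma_demo)])
names(doc_topics) <- rownames(gamma_demo)

doc_topics
```

Because each column of the matrix sums to one and the token proportions sum to one, the resulting topic scores also sum to one.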

Examples

# Load a pre-fitted topic model
data(nih_sample_topic_model)

# Make a gamma matrix, P(topic|token)
gamma <- CalcGamma(phi = nih_sample_topic_model$phi,
                   theta = nih_sample_topic_model$theta)

[Package textmineR version 3.0.5 Index]