FitCtmModel {textmineR} | R Documentation |
Fit a Correlated Topic Model
Description
A wrapper for the CTM function, which is based on Blei's original code, that returns a nicely formatted topic model.
Usage
FitCtmModel(
dtm,
k,
calc_coherence = TRUE,
calc_r2 = FALSE,
return_all = TRUE,
...
)
Arguments
dtm |
A document term matrix of class dgCMatrix |
k |
Number of topics |
calc_coherence |
Do you want to calculate probabilistic coherence of topics
after the model is trained? Defaults to TRUE. |
calc_r2 |
Do you want to calculate R-squared after the model is trained?
Defaults to FALSE. |
return_all |
Logical. Do you want the raw results of the underlying
function returned along with the formatted results? Defaults to TRUE. |
... |
Other arguments to pass to CTM or TmParallelApply. See note below. |
Value
Returns a list with a minimum of two objects, phi and theta.
The rows of phi index topics and the columns index tokens.
The rows of theta index documents and the columns index topics.
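To make these shape conventions concrete, the sketch below builds a toy stand-in for the returned list in base R. The matrices, their names, and every number in them are invented for illustration and are not output from FitCtmModel. The sketch also assumes each row of phi and theta is a probability distribution, which is the usual parameterization for topic models.

```r
# Toy stand-in for a fitted model: 2 topics, 3 tokens, 4 documents.
# All names and numbers are made up for illustration only.
phi <- matrix(c(0.7, 0.2, 0.1,
                0.1, 0.3, 0.6),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("t_1", "t_2"), c("gene", "cell", "trial")))
theta <- matrix(c(0.9, 0.1,
                  0.4, 0.6,
                  0.2, 0.8,
                  0.6, 0.4),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("doc_", 1:4), rownames(phi)))
toy_model <- list(phi = phi, theta = theta)

dim(toy_model$phi)    # topics x tokens
dim(toy_model$theta)  # documents x topics

# Most probable token for each topic, and dominant topic for each document
top_token <- colnames(toy_model$phi)[apply(toy_model$phi, 1, which.max)]
top_topic <- colnames(toy_model$theta)[apply(toy_model$theta, 1, which.max)]
```

Because the columns of theta and the rows of phi both index topics, the two matrices can be lined up by topic name, as the shared dimnames above show.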
Note
When passing additional arguments to CTM, you must unlist the
elements in the control argument and pass them one by one.
See examples for how to do this correctly.
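If the control settings already live in a named list, one possible way to pass them one by one is to splice the list into the call with do.call(). The sketch below is base R only: fit_stub is a hypothetical stand-in that merely records the argument names it receives, and ctm_control and the placeholder dtm value are also invented for illustration. With textmineR loaded, the same pattern would presumably apply with FitCtmModel in place of fit_stub, but that is an assumption and is not shown here.

```r
# Hypothetical stand-in for FitCtmModel: it only returns the names of
# the arguments it was given, so the sketch runs without textmineR.
fit_stub <- function(dtm, k, ...) {
  names(list(dtm = dtm, k = k, ...))
}

# Control settings kept together as one named list (illustrative values)
ctm_control <- list(
  verbose = 0,
  nstart = 1L,
  em = list(iter.max = 1000, tol = 10^-4)
)

# c() concatenates the two lists, so every control element becomes its
# own top-level argument instead of travelling inside a `control` list.
# Nested lists such as `em` stay intact as single arguments.
args <- c(list(dtm = "placeholder dtm", k = 3), ctm_control)
received <- do.call(fit_stub, args)
received
```

Keeping the settings in one list and splicing them at call time avoids retyping each element, while the function still sees them as separate arguments.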
Examples
# Load a pre-formatted dtm
data(nih_sample_dtm)
# Fit a CTM model on a sample of documents
model <- FitCtmModel(dtm = nih_sample_dtm[ sample(1:nrow(nih_sample_dtm) , 10) , ],
k = 3, return_all = FALSE)
# the correct way to pass control arguments to CTM
## Not run:
topics_CTM <- FitCtmModel(
dtm = nih_sample_dtm[ sample(1:nrow(nih_sample_dtm) , 10) , ],
k = 10,
calc_coherence = TRUE,
calc_r2 = TRUE,
return_all = TRUE,
estimate.beta = TRUE,
verbose = 0,
prefix = tempfile(),
save = 0,
keep = 0,
seed = as.integer(Sys.time()),
nstart = 1L,
best = TRUE,
var = list(iter.max = 500, tol = 10^-6),
em = list(iter.max = 1000, tol = 10^-4),
initialize = "random",
cg = list(iter.max = 500, tol = 10^-5)
)
## End(Not run)