FindTopicsNumber {ldatuning} | R Documentation |
FindTopicsNumber
Description
Calculates one or more metrics to estimate the most suitable number of topics for an LDA model.
Usage
FindTopicsNumber(
dtm,
topics = seq(10, 40, by = 10),
metrics = "Griffiths2004",
method = "Gibbs",
control = list(),
mc.cores = NA,
return_models = FALSE,
verbose = FALSE,
libpath = NULL
)
Arguments
dtm |
An object of class "DocumentTermMatrix" with term-frequency weighting or an object coercible to a "simple_triplet_matrix" with integer entries. |
topics |
Vector of candidate numbers of topics; one model is fitted for each value so that the models can be compared. |
metrics |
String or vector of possible metrics: "Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014". |
method |
The method to be used for fitting; see LDA. |
control |
A named list of the control parameters for estimation or an object of class "LDAcontrol". |
mc.cores |
NA, an integer, or a cluster; the number of CPU cores used to process models simultaneously. If an integer, a cluster is created on the local machine. If a cluster, it is used but not destroyed (this allows multiple-node clusters). Defaults to NA, which triggers auto-detection of the number of cores on the local machine. |
return_models |
Whether or not to return the model objects of class "LDA". Defaults to FALSE. Setting it to TRUE requires the tibble package. |
verbose |
If FALSE (default), suppress all warnings and additional information. |
libpath |
Path to R packages (use only if your R installation can't find the 'topicmodels' package; see [issue #3](https://github.com/nikita-moor/ldatuning/issues/3)). For example: "C:/Program Files/R/R-2.15.2/library" (Windows), "/home/user/R/x86_64-pc-linux-gnu-library/3.2" (Linux). |
Value
Data frame with the numbers of topics and the corresponding values
of one or more metrics. Can be passed directly to
FindTopicsNumber_plot
to draw a plot.
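As a rough sketch of how the returned data frame might be inspected without plotting, the base-R snippet below picks, for each metric column, the number of topics with the best score. It assumes the result has a `topics` column plus one column per requested metric, and that "Griffiths2004" and "Deveaud2014" are to be maximized while "CaoJuan2009" and "Arun2010" are to be minimized. `best_topics` is a hypothetical helper, not part of ldatuning, and the values in `toy` are made-up placeholders rather than real output.

```r
# Hypothetical helper (not part of ldatuning): for each metric column,
# return the number of topics with the best score.
best_topics <- function(result) {
  maximize <- c("Griffiths2004", "Deveaud2014")  # assumed: higher is better
  minimize <- c("CaoJuan2009", "Arun2010")       # assumed: lower is better
  metrics <- intersect(names(result), c(maximize, minimize))
  vapply(metrics, function(m) {
    idx <- if (m %in% maximize) {
      which.max(result[[m]])
    } else {
      which.min(result[[m]])
    }
    as.numeric(result$topics[idx])
  }, numeric(1))
}

# Made-up placeholder values, for illustration only.
toy <- data.frame(
  topics = c(10, 20, 30),
  Griffiths2004 = c(-300, -250, -270),
  CaoJuan2009 = c(0.30, 0.20, 0.25)
)
best_topics(toy)
```

The metrics often disagree, so a summary like this is probably best treated as a starting point alongside the plot, not as a final answer.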
Examples
## Not run:
library(topicmodels)
data("AssociatedPress", package="topicmodels")
dtm <- AssociatedPress[1:10, ]
FindTopicsNumber(dtm, topics = 2:10, metrics = "Arun2010", mc.cores = 1L)
## End(Not run)
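A further sketch, assuming the ldatuning, topicmodels and parallel packages are installed: it passes an existing cluster through mc.cores, which the function uses but does not destroy, and then hands the result to FindTopicsNumber_plot. The cluster size of 2 and the choice of metrics are arbitrary, made only for illustration.

```r
## Not run:
library(topicmodels)
library(ldatuning)
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:10, ]

# Create the cluster once; FindTopicsNumber uses it but does not stop it.
cl <- parallel::makeCluster(2L)  # 2 workers is an arbitrary choice
result <- FindTopicsNumber(
  dtm,
  topics = 2:10,
  metrics = c("CaoJuan2009", "Arun2010"),
  mc.cores = cl
)
parallel::stopCluster(cl)  # the caller is responsible for cleanup

# The returned data frame can be plotted directly.
FindTopicsNumber_plot(result)
## End(Not run)
```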