coherence {sentopics} | R Documentation |
Coherence of estimated topics
Description
Computes various coherence-based metrics for topic models. It
assesses the quality of estimated topics based on co-occurrences of words.
For best results, consider cleaning the initial tokens object with padding = TRUE.
Usage
coherence(
x,
nWords = 10,
method = c("C_NPMI", "C_V"),
window = NULL,
NPMIs = NULL
)
Arguments
x |
a model created from the |
nWords |
the number of words in each topic used for evaluation. |
method |
the coherence method used. |
window |
optional. If |
NPMIs |
optional NPMI matrix. If provided, skip the computation of NPMI between words, substantially decreasing computing time. |
Details
Currently, only C_NPMI and C_V are documented. The implementation follows Röder et al. (2015). For C_NPMI, the sliding window is 10 words, whereas it is 110 words for C_V.
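To illustrate the idea behind C_NPMI, the following base R sketch computes NPMI between words from boolean sliding windows and averages it over the word pairs of one topic, in the spirit of Röder et al. (2015). It is not the package's implementation: the toy corpus, the helper names sliding_windows() and npmi_matrix(), the window size of 3 and the smoothing constant eps are all assumptions made for illustration. Edge cases, such as a word that appears in every window or in none, are not handled.

```r
# Toy corpus (made-up data): each element is one tokenized document.
docs <- list(
  c("bank", "rate", "inflation", "policy", "rate"),
  c("inflation", "price", "policy", "bank"),
  c("growth", "price", "inflation", "rate")
)

# Hypothetical helper: boolean sliding windows, i.e. the set of unique
# words seen in each window of `size` consecutive tokens.
sliding_windows <- function(tokens, size) {
  if (length(tokens) <= size) return(list(unique(tokens)))
  lapply(seq_len(length(tokens) - size + 1),
         function(i) unique(tokens[i:(i + size - 1)]))
}

# Hypothetical helper: NPMI between every pair of `words`.
npmi_matrix <- function(docs, words, size = 10, eps = 1e-12) {
  windows <- unlist(lapply(docs, sliding_windows, size = size),
                    recursive = FALSE)
  # Indicator matrix: one row per window, one column per word.
  occ <- do.call(rbind, lapply(windows, function(win) words %in% win)) + 0
  colnames(occ) <- words
  p_joint <- crossprod(occ) / nrow(occ)  # P(w_i, w_j); diagonal is P(w_i)
  p_word <- diag(p_joint)
  pmi <- log((p_joint + eps) / (p_word %o% p_word))
  pmi / -log(p_joint + eps)              # normalize PMI to [-1, 1]
}

# Top words of a single hypothetical topic.
top_words <- c("inflation", "rate", "policy", "price")
npmi <- npmi_matrix(docs, top_words, size = 3)

# Topic score: arithmetic mean of NPMI over all distinct word pairs.
score <- mean(npmi[upper.tri(npmi)])
score
```

A matrix computed once in this way plays the same role as the NPMIs argument: reusing it avoids recomputing word co-occurrences for every call.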
Value
A vector or matrix containing the coherence score of each topic.
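A minimal usage sketch follows. It assumes that LDA(), fit() and the ECB_press_conferences_tokens dataset are provided by sentopics with the signatures used here; the number of topics (K = 5), the 10 documents and the 10 iterations are arbitrary choices for illustration, not recommended settings.

```r
library(sentopics)

# Assumption: LDA(), fit() and ECB_press_conferences_tokens come from sentopics.
model <- LDA(ECB_press_conferences_tokens[1:10], K = 5)
model <- fit(model, 10)  # very few iterations, for illustration only

# Coherence score of each topic, based on its 10 top words
scores <- coherence(model, nWords = 10, method = "C_NPMI")
scores
```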
Author(s)
Olivier Delmarcelle
References
Röder, M., Both, A., & Hinneburg, A. (2015). Exploring the Space of Topic Coherence Measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 399–408.