frequentwords {fdm2id} | R Documentation |
Frequent words
Description
Returns the most frequent words of a corpus.
Usage
frequentwords(
  corpus,
  nb,
  mincount = 5,
  minphrasecount = NULL,
  ngram = 1,
  lang = "en",
  stopwords = lang
)
Arguments
corpus | The corpus of documents (a character vector) or the vocabulary of the documents (result of function |
nb | The number of words to be returned. |
mincount | Minimum count for a word to be considered frequent. |
minphrasecount | Minimum count for a collocation of words (a phrase) to be considered frequent. |
ngram | Maximum size of n-grams. |
lang | The language of the documents (NULL to disable stemming). |
stopwords | Stopwords, or the language of the documents. NULL if stopwords should not be removed. |
Value
The most frequent words of the corpus.
See Also
Examples
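## Not run:
# Load the corpus, then list its 100 most frequent words
text <- loadtext("http://mattmahoney.net/dc/text8.zip")
frequentwords(text, 100)
# The same call also accepts a precomputed vocabulary
vocab <- getvocab(text)
frequentwords(vocab, 100)
## End(Not run)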
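The idea behind nb, mincount and stopwords can be illustrated without downloading a corpus. The following is a minimal base-R sketch, not the fdm2id implementation: top_words() is a hypothetical helper, the tokenizer is a plain split on non-letters, and stemming, n-grams and phrase counts are ignored.

```r
# Illustrative sketch only; top_words() is a hypothetical helper,
# not a function of fdm2id.
top_words <- function(docs, nb, mincount = 5, stopwords = character(0)) {
  # Lower-case the documents, then split on anything that is not a letter
  tokens <- unlist(strsplit(tolower(docs), "[^a-z]+"))
  # Drop empty tokens and stopwords
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stopwords)]
  # Count each word and order the counts from most to least frequent
  counts <- sort(table(tokens), decreasing = TRUE)
  # Keep only words seen at least mincount times, then the first nb of them
  counts <- counts[counts >= mincount]
  head(counts, nb)
}

docs <- c("the cat sat on the mat",
          "the dog sat on the log",
          "a cat and a dog")
top_words(docs, nb = 3, mincount = 2, stopwords = c("the", "a", "on", "and"))
```

With this toy corpus, only "cat", "dog" and "sat" occur at least twice once the stopwords are removed, so those are the three words returned.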
[Package fdm2id version 0.9.9 Index]