R: Frequent words

frequentwords {fdm2id}

R Documentation

Frequent words

Description

Most frequent words of the corpus.

Usage

frequentwords(
  corpus,
  nb,
  mincount = 5,
  minphrasecount = NULL,
  ngram = 1,
  lang = "en",
  stopwords = lang
)

Arguments

`corpus`	The corpus of documents (a character vector) or the vocabulary of the documents (the result of function `getvocab`).
`nb`	The number of words to be returned.
`mincount`	Minimum count for a word to be considered frequent.
`minphrasecount`	Minimum count for a collocation of words (a phrase) to be considered frequent.
`ngram`	Maximum size of n-grams.
`lang`	The language of the documents (`NULL` if no stemming).
`stopwords`	Stop words, or the language of the documents. `NULL` if stop words should not be removed.

Value

The most frequent words of the corpus.

Examples

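## Not run: 
# Download and load a sample corpus (requires network access)
text <- loadtext("http://mattmahoney.net/dc/text8.zip")
# The 100 most frequent words, computed from the raw corpus
frequentwords(text, 100)
# The same query on a vocabulary built with getvocab
vocab <- getvocab(text)
frequentwords(vocab, 100)

## End(Not run)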
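
The following base R sketch illustrates the idea behind the `nb`, `mincount` and `stopwords` arguments on a tiny in-memory corpus, without any download. It is not the fdm2id implementation: the helper `top_words`, its whitespace-style tokenization and the toy documents are assumptions made for illustration only, and stemming and n-grams are left out.

```r
# Illustrative sketch only; NOT how fdm2id computes frequent words.
docs <- c("the cat sat on the mat",
          "the dog sat on the log",
          "a cat and a dog")

# Hypothetical helper: tokenize, drop stop words, count, filter, keep the top nb.
top_words <- function(docs, nb, mincount = 1, stopwords = character(0)) {
  # Lower-case and split on anything that is not a letter
  tokens <- unlist(strsplit(tolower(docs), "[^a-z]+"))
  # Drop empty tokens and stop words
  tokens <- tokens[nzchar(tokens) & !(tokens %in% stopwords)]
  # Count occurrences as a named integer vector
  tab <- table(tokens)
  counts <- setNames(as.integer(tab), names(tab))
  # Keep words seen at least mincount times, most frequent first
  counts <- sort(counts[counts >= mincount], decreasing = TRUE)
  head(counts, nb)
}

res <- top_words(docs, nb = 3, mincount = 2,
                 stopwords = c("the", "a", "on", "and"))
print(res)
```

Here only "cat", "dog" and "sat" occur at least twice once the stop words are removed, so they are the words returned. Raising `mincount` or lowering `nb` shortens the result in the same way as the corresponding arguments are described above.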

[Package fdm2id version 0.9.9 Index]
