| prune_vocabulary {text2vec} | R Documentation | 
Prune vocabulary
Description
This function filters the input vocabulary and throws out very
frequent and very infrequent terms. See the examples for the
vocabulary function. The parameter vocab_term_max can
also be used to limit the absolute size of the vocabulary to only the most
frequently used terms.
Usage
prune_vocabulary(vocabulary, term_count_min = 1L, term_count_max = Inf,
  doc_proportion_min = 0, doc_proportion_max = 1, doc_count_min = 1L,
  doc_count_max = Inf, vocab_term_max = Inf)
Arguments
| vocabulary | a vocabulary from the vocabulary function. | 
| term_count_min | minimum number of occurrences over all documents. | 
| term_count_max | maximum number of occurrences over all documents. | 
| doc_proportion_min | minimum proportion of documents which should contain the term. | 
| doc_proportion_max | maximum proportion of documents which should contain the term. | 
| doc_count_min | term will be kept if the number of documents containing this term is at least this value. | 
| doc_count_max | term will be kept if the number of documents containing this term is at most this value. | 
| vocab_term_max | maximum number of terms in vocabulary. | 
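Details
The sketch below illustrates one way the thresholds above could combine. It is a base R approximation written for this page, not the text2vec implementation: sketch_prune, toy_vocab and n_docs are hypothetical names, the data are made up, and all bounds are assumed to be inclusive. It also assumes that vocab_term_max is applied last, keeping the most frequent of the surviving terms.

```r
# Hypothetical toy vocabulary. The column names (term, term_count, doc_count)
# are assumed to mirror a text2vec vocabulary; the counts are invented.
toy_vocab <- data.frame(
  term       = c("the", "cat", "sat", "mat", "dog"),
  term_count = c(6L, 2L, 2L, 1L, 1L),
  doc_count  = c(3L, 2L, 2L, 1L, 1L),
  stringsAsFactors = FALSE
)
n_docs <- 3L  # assumed total number of documents in the corpus

# sketch_prune() is an illustrative helper, NOT part of text2vec.
# Assumption: every bound is inclusive and all filters are combined with AND.
sketch_prune <- function(v, n_docs,
                         term_count_min = 1L, term_count_max = Inf,
                         doc_proportion_min = 0, doc_proportion_max = 1,
                         doc_count_min = 1L, doc_count_max = Inf,
                         vocab_term_max = Inf) {
  doc_proportion <- v$doc_count / n_docs
  keep <- v$term_count >= term_count_min &
    v$term_count <= term_count_max &
    doc_proportion >= doc_proportion_min &
    doc_proportion <= doc_proportion_max &
    v$doc_count >= doc_count_min &
    v$doc_count <= doc_count_max
  v <- v[keep, , drop = FALSE]
  # Assumption: the size cap keeps the most frequent remaining terms.
  if (is.finite(vocab_term_max) && nrow(v) > vocab_term_max) {
    v <- v[order(v$term_count, decreasing = TRUE), , drop = FALSE]
    v <- v[seq_len(vocab_term_max), , drop = FALSE]
  }
  v
}

# Drop terms seen fewer than 2 times and terms present in more than 90% of documents.
res <- sketch_prune(toy_vocab, n_docs,
                    term_count_min = 2L, doc_proportion_max = 0.9)
print(res)
```

In this toy data the rare terms fall below term_count_min, while "the" occurs in every document and so exceeds doc_proportion_max.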
See Also
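Examples
A minimal usage sketch, assuming the text2vec package is installed. The three documents are made up for illustration, and the vocabulary is built with create_vocabulary, which is assumed here to be the constructor this page calls "the vocabulary function". The threshold values are arbitrary.

```r
library(text2vec)

# Three made-up documents, used only for illustration.
docs <- c("the cat sat on the mat",
          "the dog sat on the log",
          "the bird flew over the cat")

# Tokenize and build the full vocabulary.
it <- itoken(docs, tokenizer = word_tokenizer, progressbar = FALSE)
vocab <- create_vocabulary(it)

# Drop terms that occur fewer than 2 times overall, and terms that
# appear in more than 90% of the documents.
pruned <- prune_vocabulary(vocab,
                           term_count_min = 2L,
                           doc_proportion_max = 0.9)
print(pruned)

# Alternatively, cap the vocabulary at the 5 most frequent terms.
capped <- prune_vocabulary(vocab, vocab_term_max = 5L)
print(capped)
```

Combining a count threshold with a document-proportion threshold is a common way to remove both rare tokens and near-universal words before building a document-term matrix.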
[Package text2vec version 0.6.4 Index]