termsDictionary {RcmdrPlugin.temis} | R Documentation |
Dictionary of terms found in a corpus
Description
List all of the words that were found in the corpus, and stemmed terms present in the document-term matrix, together with their number of occurrences.
Usage
termsDictionary(dtm, order = c("alphabetic", "occurrences"))
Arguments
dtm |
a document-term matrix. |
order |
whether to sort words alphabetically, or by number of (stemmed) occurrences. |
Details
Words found in the corpus before stopwords removal and stemming are printed, together with the corresponding stemmed term that was eventually added to the document-term matrix, if stemming was enabled. Occurrences found before and after stemming are also shown.
The column “Stopword?” indicates whether the corresponding word is present in the list of stopwords for the corpus language. Words that were actually removed, either automatically by stopwords removal at import time, or manually via the Text mining->Terms->Exclude terms from analysis... menu, are signalled in the “Removed?” column. All other words are present in the final document-term matrix, in their original or in their stemmed form.
See Also
DocumentTermMatrix
, restrictTermsDlg
, freqTermsDlg
,
termCoocDlg