createDTM {tmcn}    R Documentation
Create a Chinese term-document matrix or a document-term matrix.
Description
Create a document-term matrix (createDTM) or a term-document matrix (createTDM) from a character vector of Chinese or English text.
Usage
createDTM(string, language = c("zh", "en"), tokenize = NULL, removePunctuation = TRUE,
removeNumbers = TRUE, removeStopwords = TRUE)
createTDM(string, language = c("zh", "en"), tokenize = NULL, removePunctuation = TRUE,
removeNumbers = TRUE, removeStopwords = TRUE)
Arguments
string
A character vector of texts, one element per document.
language
The language of the text: 'zh' for Chinese, 'en' for English.
tokenize
A tokenizer function. Defaults to NULL.
removePunctuation
Whether to remove punctuation.
removeNumbers
Whether to remove numbers.
removeStopwords
Whether to remove stop words.
Details
The "tm" package is required.
Value
An object of class TermDocumentMatrix or class DocumentTermMatrix.
Author(s)
Jian Li <rweibo@sina.com>
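Examples
A minimal sketch of calling both functions, assuming the tmcn and tm packages are installed. The sample strings are made up for illustration, and language = "en" is used here so that the example does not depend on how Chinese text is segmented.

```r
# Assumes tm and tmcn are installed; the documents below are illustrative only.
library(tm)
library(tmcn)

docs <- c("text mining with software",
          "mining text data",
          "software for data analysis")

# Document-term matrix: one row per document
dtm <- createDTM(docs, language = "en")

# Term-document matrix: one row per term
tdm <- createTDM(docs, language = "en")

# Inspect the resulting classes
class(dtm)
class(tdm)
```

Both results are tm objects, so they can be passed on to tm helpers such as inspect().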
[Package tmcn version 0.2-13 Index]