get_tokens {syuzhet} | R Documentation
Word Tokenization
Description
Parses a string into a vector of word tokens.
Usage
get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)
Arguments
text_of_file
    A text string.
pattern
    A regular expression used to split the text into tokens. The default, "\\W", splits on any non-word character.
lowercase
    Logical. Should tokens be converted to lowercase? Defaults to TRUE.
Value
A character vector of words.
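The following is a minimal usage sketch, not taken from the package's own examples. It assumes the syuzhet package is installed; the sample sentence and the variable names (sample_text, tokens, cased_tokens, ws_tokens) are illustrative, and the "\\s+" pattern is one possible alternative to the default rather than a recommended setting.

```r
library(syuzhet)

# Illustrative input; any character string works here.
sample_text <- "Call me Ishmael. Some years ago, never mind how long precisely."

# Default arguments: split on non-word characters ("\\W") and lowercase the tokens.
tokens <- get_tokens(sample_text)
print(tokens)

# Keep the original capitalization by turning lowercasing off.
cased_tokens <- get_tokens(sample_text, lowercase = FALSE)
print(cased_tokens)

# Split on runs of whitespace instead (an assumed alternative pattern),
# which should leave punctuation attached to the neighbouring word.
ws_tokens <- get_tokens(sample_text, pattern = "\\s+")
print(ws_tokens)
```

Comparing the three printed vectors shows how pattern controls where the text is broken and how lowercase controls case folding.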
[Package syuzhet version 1.0.7 Index]