get_tokens {syuzhet}    R Documentation

Word Tokenization

Description

Parses a string into a vector of word tokens.

Usage

get_tokens(text_of_file, pattern = "\\W", lowercase = TRUE)

Arguments

text_of_file

A text string to be tokenized.

pattern

A regular expression matching the delimiters at which the text is split into tokens. The default, "\\W", splits on any non-word character.

lowercase

Logical. Should tokens be converted to lowercase? Defaults to TRUE.

Value

A character vector of word tokens.
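
Examples

The following is a minimal sketch of how the arguments above interact, not the package's own implementation. It uses only base R so that it runs without syuzhet installed. The helper tokenize_sketch and the sample text are hypothetical, and the step that drops empty strings is an assumption about how adjacent delimiters should be handled. The call to get_tokens at the end uses only the signature shown under Usage.

```r
# Hypothetical helper: approximates splitting text on a delimiter pattern.
# Not part of syuzhet; written with base R only.
tokenize_sketch <- function(text, pattern = "\\W", lowercase = TRUE) {
  # Split on every match of the delimiter pattern
  tokens <- unlist(strsplit(text, pattern))
  # Optionally normalize case
  if (lowercase) tokens <- tolower(tokens)
  # Assumption: discard empty strings left between adjacent delimiters
  tokens[tokens != ""]
}

sample_text <- "Call me Ishmael. Some years ago, never mind how long."

# Default behaviour: split on non-word characters, lowercase the result
tokenize_sketch(sample_text)

# Preserve the original case
tokenize_sketch(sample_text, lowercase = FALSE)

# Split on whitespace only, so punctuation stays attached to the words
tokenize_sketch(sample_text, pattern = "\\s+")

# If syuzhet is installed, the documented function takes the same arguments
if (requireNamespace("syuzhet", quietly = TRUE)) {
  syuzhet::get_tokens(sample_text, pattern = "\\W", lowercase = TRUE)
}
```

Changing pattern alters what counts as a token boundary: a whitespace pattern keeps punctuation attached to words, while the default non-word pattern strips it.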


[Package syuzhet version 1.0.7 Index]