tokenize_space {piecemaker}    R Documentation
Break Text at Spaces
Description
This is an extremely simple tokenizer, breaking only and exactly on the space
character. It is intended to work in tandem with prepare_text, so that spaces
are cleaned up and inserted as necessary before the tokenizer runs. This
function and prepare_text are combined in prepare_and_tokenize.
Usage
tokenize_space(text)
Arguments
text    A character vector to tokenize.
Value
The text as a list of character vectors (one vector per element of text).
Each element of each vector is roughly equivalent to a word.
Examples
tokenize_space("This is some text.")
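
The shape of the return value can be illustrated with a minimal base R sketch. It assumes that "breaking only and exactly on the space character" behaves like a literal split on " ". The helper split_on_space is a hypothetical stand-in written for this illustration, not piecemaker's actual implementation, and the package function may differ in edge cases.

```r
# Hypothetical stand-in for illustration only; not piecemaker's implementation.
split_on_space <- function(text) {
  # fixed = TRUE treats " " as a literal space rather than a regular expression.
  strsplit(text, " ", fixed = TRUE)
}

tokens <- split_on_space(c("This is some text.", "Two  spaces"))

length(tokens)  # one character vector per element of the input
tokens[[1]]     # "This" "is" "some" "text."
tokens[[2]]     # "Two" "" "spaces": the doubled space yields an empty token
```

In this sketch, punctuation stays attached to the preceding word and a doubled space produces an empty token, which suggests why cleaning up and inserting spaces beforehand (the role the Description assigns to prepare_text) matters for such a simple tokenizer.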
[Package piecemaker version 1.0.2 Index]