tokenize_spaces_punct {text.alignment}    R Documentation
Tokenise text into a sequence of words
Description
Tokenise text into a sequence of words. The function uses strsplit to split the text at characters belonging to the [:space:] and [:punct:] character classes, so whitespace and punctuation act as separators and do not appear in the result.
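The splitting described above can be sketched in base R. This is a minimal illustration, not the package source: the name sketch_tokenize, the exact regular expression, and the removal of empty strings are all assumptions made for the example.

```r
# Hypothetical re-implementation for illustration only; not the package code.
sketch_tokenize <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  # Split on runs of whitespace and/or punctuation characters.
  words <- strsplit(x, "[[:space:][:punct:]]+")[[1]]
  # strsplit() returns a leading "" when the input starts with a separator;
  # drop such empty pieces (an assumption about the desired behaviour).
  words[nzchar(words)]
}

sketch_tokenize("This just splits. Text.alongside\nspaces right?")
# [1] "This"      "just"      "splits"    "Text"      "alongside" "spaces"    "right"
```

Matching runs of separators with `+` means that consecutive punctuation such as ".." or "??" produces one split rather than several empty tokens.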
Usage
tokenize_spaces_punct(x)
Arguments
x    a character string of length 1
Value
a character vector with the sequence of words in x
See Also
strsplit
Examples
tokenize_spaces_punct("This just splits. Text.alongside\nspaces right?")
tokenize_spaces_punct("Also .. multiple punctuations or ??marks")
tokenize_spaces_punct("Joske Vermeulen")
[Package text.alignment version 0.1.4 Index]