pre_tokenizer_whitespace {tok} | R Documentation
This pre-tokenizer simply splits using the following regex: \w+|[^\w\s]+
Description
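This pre-tokenizer splits the input using the regex \w+|[^\w\s]+. Each match becomes one piece: either a run of word characters (\w+) or a run of characters that are neither word characters nor whitespace ([^\w\s]+). Whitespace is never part of a match, so it only separates pieces.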
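To make the effect of the regex concrete, here is a minimal sketch that applies the same pattern with base R only. It does not call tok and is not the package's implementation. The helper name split_whitespace is hypothetical, and the sketch assumes ASCII input, since R's PCRE engine and the engine behind tok may treat non-ASCII word characters differently.

```r
# Hypothetical helper, not part of tok: illustrates the documented regex
# using base R's PCRE support (perl = TRUE is needed for \w and \s inside
# a character class).
split_whitespace <- function(text) {
  pattern <- "\\w+|[^\\w\\s]+"
  # gregexpr() locates every match; regmatches() extracts the matched text.
  regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
}

pieces <- split_whitespace("Hello, world! It's 42.")
print(pieces)
# "Hello" "," "world" "!" "It" "'" "s" "42" "."
```

Note that the apostrophe in "It's" becomes its own piece, and that consecutive punctuation such as "!!" would stay together as a single piece because [^\w\s]+ matches a whole run.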
Super class
tok::tok_pre_tokenizer
-> tok_pre_tokenizer_whitespace
Methods
Public methods
Method new()
Initializes the whitespace pre-tokenizer.
Usage
pre_tokenizer_whitespace$new()
Method clone()
The objects of this class are cloneable with this method.
Usage
pre_tokenizer_whitespace$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other pre_tokenizer: pre_tokenizer, pre_tokenizer_byte_level
[Package tok version 0.1.3 Index]