| model_bpe {tok} | R Documentation |
BPE model
Description
A BPE (Byte-Pair Encoding) model.
Super class
tok::tok_model -> tok_model_bpe
Methods
Public methods
Method new()
Initializes a BPE model, an implementation of the BPE (Byte-Pair Encoding) algorithm.
Usage
model_bpe$new(
  vocab = NULL,
  merges = NULL,
  cache_capacity = NULL,
  dropout = NULL,
  unk_token = NULL,
  continuing_subword_prefix = NULL,
  end_of_word_suffix = NULL,
  fuse_unk = NULL,
  byte_fallback = FALSE
)
Arguments
vocab
    A named integer vector mapping string keys (tokens) to their corresponding ids. Default: NULL.
merges
    A list of pairs of tokens ([character, character]). Default: NULL.
cache_capacity
    The number of words that the BPE cache can contain. The cache speeds up tokenization by storing the results of merge operations. Default: NULL.
dropout
    A float between 0 and 1 representing the BPE dropout to use. Default: NULL.
unk_token
    The unknown token to be used by the model. Default: NULL.
continuing_subword_prefix
    The prefix to attach to subword units that don't represent the beginning of a word. Default: NULL.
end_of_word_suffix
    The suffix to attach to subword units that represent the end of a word. Default: NULL.
fuse_unk
    Whether to fuse consecutive unknown tokens into a single one. Default: NULL.
byte_fallback
    Whether to use the SentencePiece (spm) byte-fallback trick. Default: FALSE.
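Examples
The following is a minimal sketch of how the vocab, merges, and unk_token arguments described above might be supplied. The toy vocabulary, the single merge rule, and the "[UNK]" token are made up for illustration and do not come from a real tokenizer; the call follows the argument types documented above but should be checked against the installed version of tok.

```r
library(tok)

# Hypothetical toy vocabulary: a named integer vector mapping tokens to ids.
vocab <- c("a" = 0L, "b" = 1L, "ab" = 2L, "[UNK]" = 3L)

# Hypothetical merge rules: a list of character pairs.
# Here the pair ("a", "b") is merged into the token "ab".
merges <- list(c("a", "b"))

# Construct the model; every argument not listed keeps its default.
model <- model_bpe$new(
  vocab = vocab,
  merges = merges,
  unk_token = "[UNK]"
)
```

Leaving vocab and merges as NULL creates an empty model, which is the usual starting point when the vocabulary is to be learned by training rather than supplied by hand.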
Method clone()
The objects of this class are cloneable with this method.
Usage
model_bpe$clone(deep = FALSE)
Arguments
deep
    Whether to make a deep clone.
See Also
Other model:
model_unigram,
model_wordpiece,
tok_model