model_bpe {tok} | R Documentation
BPE model
Description
BPE model
Super class
tok::tok_model -> tok_model_bpe
Methods
Public methods
Method new()
Initializes a BPE model, an implementation of the BPE (Byte-Pair Encoding) algorithm.
Usage
model_bpe$new( vocab = NULL, merges = NULL, cache_capacity = NULL, dropout = NULL, unk_token = NULL, continuing_subword_prefix = NULL, end_of_word_suffix = NULL, fuse_unk = NULL, byte_fallback = FALSE )
Arguments
vocab
A named integer vector of string keys and their corresponding ids. Default: NULL.
merges
A list of pairs of tokens ([character, character]). Default: NULL.
cache_capacity
The number of words that the BPE cache can contain. The cache speeds up the process by storing merge operation results. Default: NULL.
dropout
A float between 0 and 1 representing the BPE dropout to use. Default: NULL.
unk_token
The unknown token to be used by the model. Default: NULL.
continuing_subword_prefix
The prefix to attach to subword units that don’t represent the beginning of a word. Default: NULL.
end_of_word_suffix
The suffix to attach to subword units that represent the end of a word. Default: NULL.
fuse_unk
Whether to fuse consecutive unknown tokens into a single one. Default: NULL.
byte_fallback
Whether to use the SentencePiece (spm) byte-fallback trick. Default: FALSE.
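The roles of vocab, merges, and unk_token can be illustrated with a minimal base R sketch. It does not call tok and is not the package's implementation. The helper bpe_encode_word(), the toy vocabulary, and the "<unk>" token are hypothetical names chosen for illustration; the only things taken from the Arguments above are the shapes of vocab (a named integer vector) and merges (a list of character pairs). The sketch assumes that merges are listed from highest to lowest priority and applies them in that order.

```r
# Toy vocabulary in the documented shape: a named integer vector (token -> id).
# The tokens and ids are arbitrary illustration values.
vocab <- c("<unk>" = 0L, "l" = 1L, "o" = 2L, "w" = 3L, "lo" = 4L, "low" = 5L)

# Toy merge rules in the documented shape: a list of character pairs,
# assumed here to be ordered from highest to lowest priority.
merges <- list(c("l", "o"), c("lo", "w"))

# Hypothetical helper (not part of tok): encode one word with BPE merges.
bpe_encode_word <- function(word, vocab, merges, unk_token = "<unk>") {
  # Start from single characters.
  symbols <- strsplit(word, "")[[1]]
  for (pair in merges) {
    i <- 1L
    while (i < length(symbols)) {
      if (symbols[i] == pair[1] && symbols[i + 1L] == pair[2]) {
        # Replace the adjacent pair with the merged symbol.
        merged <- paste0(pair[1], pair[2])
        symbols <- append(symbols[-c(i, i + 1L)], merged, after = i - 1L)
      } else {
        i <- i + 1L
      }
    }
  }
  # Symbols missing from the vocabulary are mapped to the unknown token.
  symbols[!(symbols %in% names(vocab))] <- unk_token
  list(tokens = symbols, ids = unname(vocab[symbols]))
}

bpe_encode_word("low", vocab, merges)$tokens   # "low"
bpe_encode_word("lowx", vocab, merges)$tokens  # "low" "<unk>"
```

In the first call the two merge rules collapse "l", "o", "w" into the single token "low". In the second, "x" has no vocabulary entry, so it is replaced by the unknown token.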
Method clone()
The objects of this class are cloneable with this method.
Usage
model_bpe$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
See Also
Other model: model_unigram, model_wordpiece, tok_model