wordpiece_vocab {wordpiece.data}    R Documentation
Load a wordpiece Vocabulary
Description
A wordpiece vocabulary is a named integer vector with class "wordpiece_vocabulary". The names of the vector are the tokens, and the values are the integer identifiers of those tokens. The vocabulary is 0-indexed for compatibility with Python implementations.
Usage
wordpiece_vocab(cased = FALSE)
Arguments
cased
Logical; if TRUE, load the cased vocabulary; if FALSE (the default), load the uncased vocabulary.
Value
A wordpiece_vocabulary: a named integer vector of class "wordpiece_vocabulary".
Examples
head(wordpiece_vocab())
head(wordpiece_vocab(cased = TRUE))
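The structure described above can be illustrated without loading the package data. The following is a minimal sketch that builds a toy vocabulary by hand; the tokens and the names `toy_tokens` and `toy_vocab` are hypothetical and are not taken from the real vocabulary. It shows how 0-indexed identifiers interact with R's 1-based positions, assuming the tokens are stored in identifier order.

```r
# Toy stand-in for a wordpiece vocabulary (hypothetical tokens, not the real data).
toy_tokens <- c("[PAD]", "[UNK]", "the", "##ing")
toy_vocab <- structure(
  setNames(seq_along(toy_tokens) - 1L, toy_tokens),  # identifiers start at 0
  class = "wordpiece_vocabulary"
)

# Look up a token's identifier by name.
id_of_the <- toy_vocab[["the"]]

# R positions are 1-based, so in this toy vector the token with
# identifier k sits at position k + 1.
token_with_id_2 <- names(toy_vocab)[2L + 1L]

# Matching on the values avoids relying on the storage order.
token_by_match <- names(toy_vocab)[match(3L, unclass(toy_vocab))]
```

The same lookups should apply to the object returned by wordpiece_vocab(), although the position-based form relies on the ordering assumption stated above; matching on the values is the safer choice when that ordering is not guaranteed.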
[Package wordpiece.data version 2.0.0]