| encoding {tok} | R Documentation |
Encoding
Description
Represents the output of a tokenizer.
Value
An encoding object containing encoding information such as attention masks and token ids.
Public fields
.encoding: The underlying implementation pointer.
Active bindings
ids: The token IDs, which are the main input to a language model. They are the token indices, i.e. the numerical representation of the text that a language model understands.
attention_mask: The attention mask used as input for transformer models.
Methods
Public methods
Method new()
Initializes an encoding object. Not intended to be called directly.
Usage
encoding$new(encoding)
Arguments
encoding: An encoding implementation object.
Method clone()
The objects of this class are cloneable with this method.
Usage
encoding$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
Examples
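# Use a temporary directory as the Hugging Face Hub cache for this example
withr::with_envvar(c(HUGGINGFACE_HUB_CACHE = tempdir()), {
  # try() keeps the example from failing if the tokenizer cannot be loaded
  try({
    tok <- tokenizer$from_pretrained("gpt2")
    # encode() returns an encoding object
    encoding <- tok$encode("Hello world")
    encoding
  })
})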
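The active bindings described above can be read directly from an encoding object. The following is a minimal sketch, assuming the tok package is installed and that the "gpt2" tokenizer can be fetched from the Hugging Face Hub (network access is an assumption of this sketch, as is the expectation that both bindings return one entry per token):

```r
library(tok)

# Load a pretrained tokenizer and encode a short string
tok <- tokenizer$from_pretrained("gpt2")
encoding <- tok$encode("Hello world")

# Token IDs: the numerical representation of the input text
encoding$ids

# Attention mask, as passed to transformer models alongside the IDs
encoding$attention_mask

# Assumed here: the mask has one entry per token ID
length(encoding$ids) == length(encoding$attention_mask)
```

Reading the fields through `$ids` and `$attention_mask` avoids touching the `.encoding` pointer, which is an implementation detail.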
[Package tok version 0.1.3 Index]