model_wordpiece {tok}    R Documentation

An implementation of the WordPiece algorithm

Description

An implementation of the WordPiece algorithm

Super class

tok::tok_model -> tok_model_wordpiece

Methods

Public methods


Method new()

Constructor for the WordPiece model

Usage
model_wordpiece$new(
  vocab = NULL,
  unk_token = NULL,
  max_input_chars_per_word = NULL
)
Arguments
vocab

A dictionary mapping token strings to their corresponding ids. Default: NULL.

unk_token

The unknown token to be used by the model. Default: NULL.

max_input_chars_per_word

The maximum number of characters to allow in a single word. Default: NULL.
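
To illustrate how the three arguments interact, the following is a minimal base R sketch of greedy longest-match-first WordPiece segmentation for a single word. It is not the code tok runs. The helper name wordpiece_sketch, the "##" continuation prefix (the convention used by BERT-style vocabularies), the "[UNK]" token, and the limit of 100 characters are assumptions chosen for the example, and the vocabulary is a toy one represented here as a named integer vector.

```r
# Hypothetical helper, not part of tok: segments one word into WordPiece
# tokens by repeatedly taking the longest vocabulary entry that matches
# at the current position.
wordpiece_sketch <- function(word, vocab, unk_token = "[UNK]",
                             max_input_chars_per_word = 100L,
                             prefix = "##") {
  chars <- strsplit(word, "")[[1]]
  n <- length(chars)

  # Words longer than the limit are mapped to the unknown token as a whole.
  if (n > max_input_chars_per_word) return(unk_token)

  pieces <- character(0)
  start <- 1L
  while (start <= n) {
    end <- n
    found <- NULL
    # Try the longest remaining substring first, then shrink from the right.
    while (end >= start) {
      candidate <- paste(chars[start:end], collapse = "")
      # Pieces that do not start the word carry the continuation prefix.
      if (start > 1L) candidate <- paste0(prefix, candidate)
      if (candidate %in% names(vocab)) {
        found <- candidate
        break
      }
      end <- end - 1L
    }
    # If nothing matches at this position, the whole word is unknown.
    if (is.null(found)) return(unk_token)
    pieces <- c(pieces, found)
    start <- end + 1L
  }
  pieces
}

# Toy vocabulary (assumed for illustration only).
vocab <- c("[UNK]" = 0L, "un" = 1L, "##aff" = 2L, "##able" = 3L)

wordpiece_sketch("unaffable", vocab)
# "un" "##aff" "##able"
wordpiece_sketch("xyz", vocab)
# "[UNK]"
wordpiece_sketch("unaffable", vocab, max_input_chars_per_word = 5L)
# "[UNK]"
```

In this sketch a word either splits entirely into known pieces or collapses to the single unknown token, which is why unk_token must itself be present in the vocabulary for its id to be looked up.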


Method clone()

The objects of this class are cloneable with this method.

Usage
model_wordpiece$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

See Also

Other model: model_bpe, model_unigram, tok_model


[Package tok version 0.1.3 Index]