| Topic | Title |
|---|---|
| tok-package | tok: Fast Text Tokenization |
| decoder_byte_level | Byte level decoder |
| encoding | Encoding |
| model_bpe | BPE model |
| model_unigram | An implementation of the Unigram algorithm |
| model_wordpiece | An implementation of the WordPiece algorithm |
| normalizer_nfc | NFC normalizer |
| normalizer_nfkc | NFKC normalizer |
| pre_tokenizer | Generic class for pre-tokenizers |
| pre_tokenizer_byte_level | Byte level pre-tokenizer |
| pre_tokenizer_whitespace | Whitespace pre-tokenizer; splits on the regex \w+\|[^\w\s]+ |
| processor_byte_level | Byte level post-processor |
| tok | tok: Fast Text Tokenization |
| tokenizer | Tokenizer |
| tok_decoder | Generic class for decoders |
| tok_model | Generic class for tokenization models |
| tok_normalizer | Generic class for normalizers |
| tok_processor | Generic class for processors |
| tok_trainer | Generic training class |
| trainer_bpe | BPE trainer |
| trainer_unigram | Unigram tokenizer trainer |
| trainer_wordpiece | WordPiece tokenizer trainer |