Byte Pair Encoding Text Tokenization



Documentation for package ‘tokenizers.bpe’ version 0.1.3

Help Pages

belgium_parliament    Dataset from 2017 with Questions asked in the Belgium Federal Parliament
bpe                   Construct a Byte Pair Encoding model
bpe_decode            Decode Byte Pair Encoding sequences to text
bpe_encode            Tokenise text alongside a Byte Pair Encoding model
bpe_load_model        Load a Byte Pair Encoding model
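
The help pages listed above fit together roughly as follows. This is a minimal sketch rather than an authoritative reference: it assumes the package is installed, that `belgium_parliament` has `text` and `language` columns, and that `bpe()` accepts the `coverage`, `vocab_size`, `threads` and `model_path` arguments used here. The sample sentences, the temporary file name and the parameter values are illustrative choices only; check each function's own help page for the exact signature.

```r
library(tokenizers.bpe)

# Example data shipped with the package; keep only the French questions
# (assumes a `language` column with the value "french").
data(belgium_parliament, package = "tokenizers.bpe")
x <- subset(belgium_parliament, language == "french")

# Train a small model and store it in a temporary file
# (illustrative path and parameter values).
path  <- file.path(tempdir(), "youtokentome.bpe")
model <- bpe(x$text, coverage = 0.999, vocab_size = 5000,
             threads = 1, model_path = path)

# Illustrative sentences to tokenise.
text <- c("L'appartement est grand & vraiment bien situe en plein centre",
          "Proportion de femmes dans les conseils d'administration")

# Encode as subword strings and as integer ids.
subwords <- bpe_encode(model, x = text, type = "subwords")
ids      <- bpe_encode(model, x = text, type = "ids")

# Map the ids back to text.
decoded <- bpe_decode(model, ids)

# Reload the stored model from disk and encode again.
reloaded     <- bpe_load_model(path)
ids_reloaded <- bpe_encode(reloaded, x = text, type = "ids")
```

Writing the model to `tempdir()` keeps the working directory clean; for a model you intend to reuse, pass a permanent `model_path` instead.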