by Omie6541 1280 days ago
Can we please have some good example inputs/outputs in the README itself? What is the expected output of print(enc.encode("hello world"))?

Have a look at https://beta.openai.com/tokenizer, which uses a JavaScript reimplementation of the GPT-2 / GPT-3 BPE tokenizer. For "hello world" it gives [31373, 995].
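
To check this locally rather than in the web tool, here is a rough Python sketch. It assumes the library in question is tiktoken and that its "gpt2" encoding matches the tokenizer linked above. The helper names encode_and_check and gpt2_ids are made up for illustration and are not part of any library.

```python
def encode_and_check(enc, text):
    """Encode text, confirm that decoding restores it, and return the token ids.

    `enc` is any object with encode(str) -> list[int] and decode(list[int]) -> str.
    """
    ids = enc.encode(text)
    if enc.decode(ids) != text:
        raise ValueError("round trip did not reproduce the input")
    return ids


def gpt2_ids(text):
    """Hypothetical convenience wrapper around tiktoken's GPT-2 encoding."""
    # Imported lazily so encode_and_check stays usable without tiktoken installed.
    import tiktoken  # third-party package: pip install tiktoken

    # Assumption: "gpt2" is the encoding the question refers to.
    # The vocabulary is downloaded and cached on first use.
    enc = tiktoken.get_encoding("gpt2")
    return encode_and_check(enc, text)
```

With tiktoken installed, print(gpt2_ids("hello world")) should print the same [31373, 995] quoted in the reply above. Other encodings have different vocabularies, so the ids will differ there.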