Hacker News
by
Omie6541
1280 days ago
Can we please have some good example inputs/outputs in the README itself? What is the expected output of print(enc.encode("hello world"))?
2 comments
mnks
1280 days ago
Have a look at https://beta.openai.com/tokenizer, which uses a JavaScript reimplementation of the GPT-2 / GPT-3 BPE tokenizer. In this case it's [31373, 995].
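To make it concrete why two ids come out, here is a toy byte-pair-encoding sketch in plain Python. The merge list and the two-entry vocabulary are hypothetical and cover only this one string; the ids are the ones quoted above, and the sketch assumes (as in the GPT-2 vocabulary) that the leading space is part of the second token, " world". Real BPE uses byte-level pre-tokenization and a learned merge table with tens of thousands of entries, so treat this as an illustration of the idea, not the library's implementation.

```python
# Toy BPE sketch. MERGES and VOCAB are made up for illustration and only
# cover the string "hello world"; they are not the real GPT-2 tables.

# Hypothetical merge rules, highest priority first.
MERGES = [
    ("h", "e"), ("l", "l"), ("he", "ll"), ("hell", "o"),
    (" ", "w"), ("o", "r"), ("l", "d"), (" w", "or"), (" wor", "ld"),
]

# Ids taken from the reply above; the space belongs to the second token.
VOCAB = {"hello": 31373, " world": 995}


def bpe_encode(text, merges, vocab):
    """Apply each merge rule in order, then map the resulting tokens to ids."""
    tokens = list(text)  # start from single characters
    for left, right in merges:
        merged, i = [], 0
        while i < len(tokens):
            # Fuse an adjacent (left, right) pair into one token.
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    # Raises KeyError for any token outside the toy vocabulary.
    return [vocab[t] for t in tokens]


print(bpe_encode("hello world", MERGES, VOCAB))  # [31373, 995]
```

With the real library, enc in the question would come from something like tiktoken.get_encoding("gpt2"), and no hand-written merge table is involved.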
stabbles
1280 days ago
https://github.com/openai/tiktoken/blob/main/tests/test_simp...