Hacker News
by
isaacfung
979 days ago
I think that is partly why LLMs are bad at math and often fail at counting subsequences. Play with the tokenizer and you'll see that long numbers are split into groups of 2 or 3 digits.
https://huggingface.co/spaces/Xenova/the-tokenizer-playgroun...
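To make the splitting concrete without the playground, here is a toy sketch. It is not a real tokenizer: `chunk_digits` is a hypothetical helper, and the fixed left-to-right grouping of 3 digits is an assumption for illustration only. Real BPE vocabularies split numbers differently from model to model.

```python
def chunk_digits(number: str, size: int = 3) -> list[str]:
    """Split a digit string left to right into pieces of at most `size` digits.

    Hypothetical stand-in for how a tokenizer might break up a long number;
    real tokenizers use a learned vocabulary, not a fixed chunk size.
    """
    return [number[i:i + size] for i in range(0, len(number), size)]


seven_digits = chunk_digits("1234567")
eight_digits = chunk_digits("12345678")

print(seven_digits)  # ['123', '456', '7']
print(eight_digits)  # ['123', '456', '78']
```

In both outputs the first piece is `123`, but it stands for 1,230,000 in the first number and 12,300,000 in the second. The model never sees individual digits or their place values directly, which is one plausible reason arithmetic and counting are hard for it.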