Hacker News
by isaacfung 749 days ago
Current-gen LLMs tokenize numbers digit by digit, unlike earlier LLMs.
2 comments

They don't, which you can easily check with any of the dozen or so web apps that currently implement the GPT-4o tokenizer.
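To make the difference concrete, here is a toy sketch of the two splitting strategies. It is not the real GPT-4o tokenizer: it only mimics the number-splitting step, and it assumes a rule that groups up to three digits at a time, which is my understanding of how the GPT-4 family pre-tokenizes numbers. The function names and the `max_digits` parameter are made up for illustration.

```python
import re


def split_digit_by_digit(text: str) -> list[str]:
    """Split so that every digit is its own piece (hypothetical helper)."""
    # \d matches a single digit; \D+ keeps runs of non-digits together.
    return re.findall(r"\d|\D+", text)


def split_in_chunks(text: str, max_digits: int = 3) -> list[str]:
    """Split digits left to right into runs of at most max_digits.

    Assumption: this imitates a chunked number rule such as "1 to 3 digits";
    it ignores everything else a real tokenizer does.
    """
    return re.findall(rf"\d{{1,{max_digits}}}|\D+", text)


if __name__ == "__main__":
    number = "1234567"
    print(split_digit_by_digit(number))  # one piece per digit
    print(split_in_chunks(number))       # pieces of up to three digits
```

If a tokenizer web app shows a long number broken into multi-digit pieces like the second output, it is not tokenizing digit by digit.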
No, it doesn't help. Bloomberg tried this and it didn't seem to make much difference.
If someone else is interested in the Bloomberg tokenizer:

https://medium.com/generative-ai-insights-for-business-leade...