Hacker News
by JustFinishedBSG 61 days ago
Interesting, it doesn't seem intuitive at all to me.

My (wrong?) understanding was that how "good" a tokenizer is at compression (fewer tokens for the same text) correlated positively with downstream model performance. Guess not.