by JustFinishedBSG
61 days ago
Interesting, it doesn't seem intuitive at all to me. My (wrong?) understanding was that there was a positive correlation between how "good" a tokenizer is in terms of compression and the downstream model performance. Guess not.