by ipieter
57 days ago
There is currently very little evidence that morphological tokenizers improve model performance [1]. For languages like German, where words get glued together, there is a bit more evidence (e.g. a paper I worked on [2]), but overall I am starting to suspect that the bitter lesson also holds for tokenization.

[1] https://arxiv.org/pdf/2507.06378
[2] https://pieter.ai/bpe-knockout/