|
|
|
|
|
by yorwba
528 days ago
|
|
Ideally you'd have a language model that can predict a good continuation after any byte. If an existing model can't do that because it's too reliant on a specific tokenization, you might nonetheless be able to fine-tune it until it can gracefully handle the unexpected tokenizations that result from splitting at a random byte. |
|