by eldenring
1174 days ago
There might not be a "Z" token for some of these names. As a made-up example, "Lukasz" might tokenize to ["Luk", "asz"], so the model has no direct notion of how the word is actually spelled. I suspect that if the training data included explicit instructions on spelling, it would do better at this, but it seems unlikely that the training data contains a natural-language description of how Polish (?) names are spelled.
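To make the made-up example concrete, here is a rough sketch of a greedy longest-match tokenizer. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are invented for illustration; this is not how any particular model's tokenizer actually works, just a way to see why the letter never shows up as its own token.

```python
# Toy vocabulary, invented for illustration; real tokenizers learn theirs from data.
TOY_VOCAB = {"Luk", "asz", "L", "u", "k", "a", "s", "z"}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match split of `word` into vocabulary pieces."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts at position i.
            raise ValueError(f"no token covers {word[i]!r}")
    return tokens


tokens = toy_tokenize("Lukasz")
print(tokens)          # ['Luk', 'asz']
print("z" in tokens)   # False: the model never sees a standalone "z"
```

Even though "z" is in the toy vocabulary, the longer piece "asz" wins, so the model only ever receives an opaque ID for "asz" rather than its individual letters.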