by anonymoushn
53 days ago
their old tokenizer performed some space collapsing that allowed them to use the same token id for a word with and without the leading space (in cases where the context usually implies a space and one is not present, a "no space" symbol is used).
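
To make the idea concrete, here is a minimal toy sketch of that scheme. It is not the actual tokenizer being discussed: the `<nospace>` marker name, the regex split, and the `encode`/`decode` helpers are all assumptions for illustration, and tokens are kept as strings where a real tokenizer would map each one to an integer id.

```python
import re

# Hypothetical marker; the real tokenizer's "no space" symbol is not named in the thread.
NO_SPACE = "<nospace>"

def encode(text):
    """Split text into word and punctuation pieces.

    A single space between consecutive pieces is implied, so "world" is the
    same token whether or not it had a leading space in the source.
    """
    tokens = []
    prev_end = None
    for m in re.finditer(r"\w+|[^\w\s]", text):
        if prev_end is not None and m.start() == prev_end:
            # The pieces touch in the source, so the implied space is absent.
            tokens.append(NO_SPACE)
        tokens.append(m.group())
        prev_end = m.end()
    return tokens

def decode(tokens):
    """Join pieces with a space unless a NO_SPACE marker precedes the piece."""
    out = []
    glue = False
    for tok in tokens:
        if tok == NO_SPACE:
            glue = True
            continue
        if out and not glue:
            out.append(" ")
        out.append(tok)
        glue = False
    return "".join(out)

print(encode("hello, world"))
print(decode(encode("hello, world")))
```

Note that this toy version is lossy: runs of whitespace and leading whitespace collapse away on the round trip, which is the tradeoff that comes with treating the space as implied.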