|
|
|
|
|
by maxbond
37 days ago
|
|
The escape algorithm here is very simple, you remove special tokens from the runtime tokenizer's vocabulary so that it's forced to encode them as multiple non-special tokens. (That doesn't actually mean the LLM won't treat them as special tokens though, so this isn't sufficient on it's own.) |
|
Doing this means that you can't just tokenize the string output of the chat template as one big string. You might need to tokenize things separately, and combine them after.