|
|
|
|
|
by akoboldfrying
155 days ago
|
|
I find this surprising, but I suppose it must be more efficient overall. Presumably parsing text into tokens is done in some deterministic way. If it is done by greedily taking the longest-matching prefix that is a token, then when generating text it should be possible to "enrich" tokens that are prefixes of other tokens with additional constraints to force a unique parse: E.g., if "e" is a token but "en" is too, then after generating "e" you must never generate a token that begins with "n". A text generated this way can be deterministically parsed by the greedy parser. Alternatively, it would suffice to restrict to a subset of tokens that are a prefix code. This would be simpler, but with lower coding efficiency. |
|
Regarding the second part: you'd effectively just be limiting yourself to single character tokens in that case which would drastically impact the LLM's output quality