|
|
|
|
|
by dnautics
128 days ago
|
|
a tombstone "token "doesnt have to be an actual token, nor does it have to be explicitly carved out into the tokenizer. it can be learned. unless you have looked into the activations of a SOTA llm you cant categorically say that one (or 80% of one, fir example) doesn't exist. |
|
Current LLM implementations cannot delete output text, i.e. they cannot remove text from their context window. The recursive application is such that outputs are always expanding what is in the window, so there is no backtracking like humans can do, i.e. "this text was bad, ignore it and remove from context". That's part of why we got crazy loops / spirals like we did with the "show me the seahorse emoji" prompts.
Backtracking needs more than just a special token or cluster of tokens, but also for the LLM behaviour to be modified when it sees that token or token cluster. This must be manually coded in, it cannot be learned.