|
|
|
|
|
by code_runner
933 days ago
|
|
Whats more impressive is when GPT3.5 or 4 are capable of not just unscrambling, but answering questions about text that is flat out wrong. If you feed something like a bad transcript or some other very lossy (but not strictly scrambled) input.... it really can roll with it and just spit out correct information. Bad tokens in don't necessarily mean bad tokens out.... I'm sure there is a limit to how many tokens can be flat out bad before the "next token" in the response is thrown off, but after seeing what it can do with some of these inputs, the fact it can unscramble is not at all surprising/interesting. |
|
The encodings of LM's tokens reserve individual characters so that scrambled or new words can be encoded. And most LM's are trained on scrambled words as part of training copus, thus, they learn character-level embeddings.
Thus, basically, the paper is a very old news. This behavior is expected.