Hacker News
by gadtfly 1264 days ago
This is an artifact of an implementation-specific trick that trades performance at character-level tasks for performance at everything else. It does not reflect anything inherent about this type of model's capabilities: https://www.gwern.net/GPT-3#bpes

GPT-3 does not see individual characters. It sees "djsjcnnrjfkalcr" chunked as [d, js, jc, nn, r, j, f, k, al, cr]. You can see for yourself here: https://beta.openai.com/tokenizer.
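
To make the chunking concrete, here is a minimal Python sketch. It assumes the token list exactly as quoted in this comment rather than calling a real tokenizer, and the variable names are only illustrative.

```python
# Token chunks as quoted in the comment (copied by hand, not recomputed
# with a real tokenizer).
tokens = ["d", "js", "jc", "nn", "r", "j", "f", "k", "al", "cr"]

# Concatenating the chunks recovers the original string.
text = "".join(tokens)

token_count = len(tokens)  # number of chunks the model receives
char_count = len(text)     # number of characters the question asks about

print(text)                     # djsjcnnrjfkalcr
print(token_count, char_count)  # 10 15
```

The model is handed 10 units, while the question is about 15 characters that it never observes individually.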

7 comments

Tokenization doesn't explain this kind of mistake. If you ask about "djsjc" you would get the proper answer of 5. The claim that this is a performance trade-off does not hold.

I cannot edit the question, but I would like to say that I'm extremely impressed by ChatGPT, and the entire question came from honest curiosity about its limitations. It is strange that so many responses blame my question and example as simply wrong rather than addressing the limitations of the ChatGPT model (which is admirable anyway).

Still, it makes for a great example of the difference between GPT-3 and an AGI. We would expect the latter to have enough self-awareness to recognize when it is being asked to do something beyond its abilities.
I guess this explains some recent weird behavior I saw: 1) it failed at writing haikus (in Japanese) and 2) it couldn't quite manage to generate poems without the letter e in them.
That doesn't explain why it gets things like "what is the second digit in 372?" wrong.

I think it's just fundamentally quite bad at numbers.

Nitpick: Gwern proposes this as a "plausible explanation" and does not make a definitive claim.
(It's not counting input tokens.)
So how come the answer is 18 instead of 10 then?
Because it's learned from people saying '$STRING is $N characters' a rough correlation between the token length of $STRING and $N. Given infinite training and depth, it would learn how strings tokenize and resolve the question more accurately, but this is basically it guessing what the inflation of tokens->chars is and missing.
But the correct answer is 15
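
A rough way to see the "inflation" guess in numbers is sketched below. It assumes the 10-token split quoted at the top of the thread and treats the model as if it applied a single characters-per-token ratio, which is a simplification of the explanation above and not a claim about what the model actually computes.

```python
# Token split quoted at the top of the thread (copied by hand).
tokens = ["d", "js", "jc", "nn", "r", "j", "f", "k", "al", "cr"]

true_chars = sum(len(t) for t in tokens)  # actual character count
model_answer = 18                         # the answer reported in the thread

# Characters per token: what the string really has vs. what the answer implies.
actual_ratio = true_chars / len(tokens)
implied_ratio = model_answer / len(tokens)

print(true_chars)     # 15
print(actual_ratio)   # 1.5
print(implied_ratio)  # 1.8
```

Under that simplification, the answer of 18 corresponds to overestimating the average token length (1.8 instead of 1.5 characters), which is neither the token count nor the true character count.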