by ttul 1015 days ago
Falcon fails. GPT-3.5 also fails this test. GPT-4 gets it right. I suspect that GPT-4 is just large enough to have developed a concept of counting, whereas the others are not. Alternatively, it's possible that GPT-4 has memorized the answer from its more extensive training set.
1 comment

An LLM can't count letters directly; it only "sees" tokens, not the individual characters inside them.
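To make the token point concrete, here is a rough sketch with a toy tokenizer. The vocabulary, the `tokenize` helper, and the example word are all made up for illustration; real models use learned subword vocabularies that are far larger, and this is not any particular model's tokenizer.

```python
# Hypothetical toy vocabulary: two multi-letter pieces plus single letters.
# Real tokenizers learn tens of thousands of subword pieces from data.
TOY_VOCAB = {
    "straw": 0, "berry": 1,
    "s": 2, "t": 3, "r": 4, "a": 5, "w": 6, "b": 7, "e": 8, "y": 9,
}


def tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of text into integer token IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
token_ids = tokenize(word)       # what the model receives as input
letter_count = word.count("r")   # what a character-level view would give

print(token_ids)
print(letter_count)
```

In this sketch the word reaches the model as two opaque integer IDs, and nothing in those IDs spells out which letters each piece contains. Any letter count the model produces has to come from what it learned about those pieces during training, not from reading characters.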