| You're absolutely wrong! You can also ask an LLM to solve that problem by spelling the word out first. And then it'll count the letters successfully. At a similar success rate to actual nine-year-olds. There's a technical explanation for why that works, but to you, it might as well be black magic. And if you could get a modern agentic LLM that somehow still fails that test? Chances are, it would solve it with no instructions - just one "you're wrong". 1. The LLM makes a mistake 2. User says "you're wrong" 3. The LLM re-checks by spelling the word out and gives a correct answer 4. The LLM then keeps re-checking itself using the same method for any similar inquiry within that context In-context learning isn't replaced by anything better because it's so powerful that finding "anything better" is incredibly hard. It's the bread and butter of how modern LLM workflows function. |
In fact, asking a model not to repeat the same mistake makes it more likely to commit that mistake again, because the mistake is now in its context.
I think anyone who uses LLMs a lot will tell you that your steps 3 and 4 are fictional.