| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by ACCount37 63 days ago

You're absolutely wrong!

You can also ask an LLM to solve that problem by spelling the word out first. And then it'll count the letters successfully. At a similar success rate to actual nine-year-olds.

There's a technical explanation for why that works, but to you, it might as well be black magic.

And if you could get a modern agentic LLM that somehow still fails that test? Chances are, it would solve it with no instructions - just one "you're wrong".

1. The LLM makes a mistake

2. User says "you're wrong"

3. The LLM re-checks by spelling the word out and gives a correct answer

4. The LLM then keeps re-checking itself using the same method for any similar inquiry within that context

In-context learning isn't replaced by anything better because it's so powerful that finding "anything better" is incredibly hard. It's the bread and butter of how modern LLM workflows function.

2 comments

squeaky-clean 63 days ago

This is false. You can ask it to spell out strawberry and count the letters and it will still say 2 (it's unable to actually count the letters by the way). The only way to get a model that believes strawberry has 2 R's to consistently give the correct answer is to ask it to code the problem and return the output.

In fact, asking a model not to repeat the same mistake makes it more likely to commit that mistake again, because it's in it's context.

I think anyone who uses LLMs a lot will tell your that your steps 3 and 4 are fictional.

link

ACCount37 63 days ago

Have you actually tried?

The "spell out" trick, by the way, was what was added to the system prompts of frontier models back when this entire meme was first going around. It did mitigate the issue.

link

bigyabai 63 days ago

> it's so powerful that finding "anything better" is incredibly hard.

We're back around to the start again. "Incredibly hard" is doing all of the heavy lifting in this statement, it's not all-powerful and there are enormous failure cases. Neither the human brain nor LLMs are a panacea for thought, but nobody in academia or otherwise is seriously comparing GPT to the human brain. They're distinct.

> There's a technical explanation for why that works, but to you, it might as well be black magic.

Expound however much you need. If there's one thing I've learned over the past 12 months it's that everyone is now an expert on the transformer architecture and everyone else is wrong. I'm all ears if you've got a technical argument to make, the qualitative comparison isn't convincing me.

link

ACCount37 63 days ago

I do know far more than you, which is a laughably low bar. If you want someone to hold your hand through it, ask an LLM.

The key words are "tokenization" and "metaknowledge", the latter being the only non-trivial part. An LLM can explain it in detail. They know more than you do too.

link