HN Mirror


	by leecommamichael 18 days ago
	These things don’t think. We’re going to have to reiterate this for a long time, I fear.

3 comments

emp17344 18 days ago

There is now a trillion-dollar industry bent to the task of convincing people these things can think. It’s gonna cause some damage.

suprfnk 18 days ago

I don't think they think. I still use them a lot despite that, because they are very powerful parameterised code generators.

akomtu 18 days ago

There is a movie, Gold (2016), about a fake gold mine. One of its founders is a true believer: he found a few chunks of gold and started digging for more. The other founder is a nihilist: he realised that there is no gold there, but who cares if he makes the investors believe? So he does, and almost sells the company for $300M.

In our story, investors are mining intelligence from GPUs, and they truly believe they are one inch from discovering the biggest goldmine in history. But GPUs, unlike a goldmine, cannot be inspected for traces of gold by independent contractors. To keep the hype up, the nihilists in our story dig up cheap gold-looking metals from time to time and tell investors that with a bit of alchemy - agentic workflows, etc. - those metals can be magically turned into gold.

Investors will keep digging until the end of the age, or until they run out of money.

sheeshkebab 18 days ago

…but they reason well enough given enough context (using their matmuls).

noosphr 18 days ago

To this day, frontier models think that "A and not B" means "A and B" when the sentence gets pushed far enough back in their context window. The context length that a model can reason over without obvious errors is much smaller than the advertised context: somewhere between 1/4 and 1/20 of what is advertised on the tin.

antonvs 18 days ago

Critiques like this tend to focus very hard on what models can't do. It's true, they have limitations.

But they're also superhuman in so many other ways. It's valid to point out limitations, but that doesn't support the conclusion that models are not incredibly powerful and capable of the functional equivalent of reasoning at human or superhuman levels in many scenarios.

noosphr 18 days ago

They may be better than humans at reasoning, but they are substantially worse than the first-generation logic programs from the 1950s.

cheevly 17 days ago

These types of comments help demonstrate first-hand how human reasoning stacks up against what an LLM would say in this situation.

leecommamichael 17 days ago

Agreed. Both are true. I sometimes think of the calculator as being superhuman as well.

antonvs 17 days ago

Yes, although the calculator couldn't "reason" the way ML models can.

All the political and emotional reactions to LLMs seem to obscure how absolutely amazing this technology is. I've pointed them at codebases I wrote entirely myself and had them find bugs, point things out I had missed, plan and implement refactorings to improve code quality, etc. I may be "smarter" than the models in some ways but there's no question they're smarter than me in others. They're unlike any tool we've ever had access to.

Yes, the politics and economics around them leave a lot to be desired (read: are absolutely terrible), and there are a lot of valid justifications for the "AI backlash", but there's a very important baby in that bathwater.

Npovview 18 days ago

Do you also happen to remember what you ate last Thursday?

UncleEntity 18 days ago

"If you have a question look in the specification for the answer and don't just guess" seems a fairly important thing to remember for more than a couple of minutes...

Npovview 18 days ago

I had a coding session where I was working across two repositories, and CC forgot which repository a particular file was in, so it kept grepping the parent directory. I just asked it to write all the key-value pairs it considered important to a file, and after that it never grepped the parent directory again.

ethin 18 days ago

Do you have a point? Because last time I checked, AIs were supposed to be better than us fragile faulty humans, and weren't designed to emulate us and all our faults.

Npovview 17 days ago

If you have been following the news, the harness is also a scaling direction now. Prompt your AI not to forget relevant details, or have it write them to a file it can refer to later. That way the context can be refreshed: this is the cached-facts or rolling-window method of refreshing memory, just as you would ask a colleague to explain a concept again. These are solved problems.
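
A minimal sketch of that idea, assuming a plain JSON file for the pinned facts and a fixed-size window of recent turns. `SessionMemory`, `facts.json`, and the prompt layout are hypothetical names chosen for illustration; they are not part of Claude Code or any real harness:

```python
import json
from collections import deque
from pathlib import Path


class SessionMemory:
    """Hypothetical helper: pinned facts on disk plus a rolling window of recent turns."""

    def __init__(self, facts_path, window=4):
        self.facts_path = Path(facts_path)
        # deque with maxlen drops the oldest turn once the window is full
        self.recent = deque(maxlen=window)

    def load_facts(self):
        # Read the pinned key-value facts back from disk (empty if none yet)
        if self.facts_path.exists():
            return json.loads(self.facts_path.read_text())
        return {}

    def remember(self, key, value):
        # Persist one fact so it survives after old turns are dropped
        facts = self.load_facts()
        facts[key] = value
        self.facts_path.write_text(json.dumps(facts, indent=2))

    def add_turn(self, text):
        self.recent.append(text)

    def build_prompt(self, question):
        # Facts are re-read and restated on every call, so they never
        # scroll out of the context the way old turns do.
        facts = "\n".join(f"- {k}: {v}" for k, v in self.load_facts().items())
        history = "\n".join(self.recent)
        return (
            f"Known facts:\n{facts}\n\n"
            f"Recent turns:\n{history}\n\n"
            f"Question: {question}"
        )
```

Calling `remember("parser.py", "repo-a")` once would keep that fact in every later prompt built by `build_prompt`, even after the turn that established it has fallen out of the window.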

ethin 17 days ago

Are they though? Because I really shouldn't have to use Claude Code (and I don't) just to get even decent results. As I said, I thought one of the biggest advantages AI was supposed to have was that it wouldn't need such constant reminding of things because it wasn't trying to emulate us faulty, forgetful, fragile humans who do have memory loss?

leecommamichael 18 days ago

Is that the same gap as what you’re responding to? To me, it seems his critique is about advertised capability and logical statements, and your rhetorical(?) question is about memory.
