Hacker News
by jokethrowaway 527 days ago
I don't think LLMs generalise much; that's why they're not creative and can't solve novel problems. It's pattern matching over a huge amount of data.

Study on the topic: https://arxiv.org/html/2406.15992v1

This would explain o1's poor performance on problems with variations. o3 seems to be expensive brute forcing in latent space followed by verification, which should yield better results, but I don't think we can call it generalisation.

I think we need to go back to the drawing board.

3 comments

From firsthand experience, this simply cannot be true. I can give them totally novel and unique physics problems I just made up, ones that require tracking the movement of objects through a series of events, and they answer most of them correctly. Moreover, they find analogies between disparate concepts and fields of study and make useful suggestions based on them, which is arguably the same process as human creativity.

I think ultimately the disconnect is people theorizing about what an LLM can or cannot do with an incorrect mental model of what it is, and then assuming it cannot do things that it can in fact do. The irony of discussions on LLMs is that they do more to showcase the limits of humans' ability to reason about novel situations.

Don't worry, there are thousands of researchers at the drawing boards right now.
Yeah, because if the AI boom becomes the AI bust, we'll have another 2008-level economic crisis on our hands.

The investments into AI are in the hundreds of billions (maybe even more if you factor in the number of people studying and researching AI), but the returns are in the tens of billions (if even that).

If you exclude the "growth" coming from the industry sniffing its own farts (e.g. Nvidia selling insane amounts of insanely overpriced GPUs to InsertYourFavAICorp), the actual amount of "useful goods and services" produced (API access, chat subscriptions, AI-enabled app growth, etc.) is tiny compared to the investment levels.

The AI train appears to have no brakes. A massive crash or AGI are the only options now. Both are going to be bad for average humans.

The fact that this (and tons of other legitimate critique) got downvoted into greytext speaks so much louder to me than all the benchmarks in the world.