Hacker News
by intended 1014 days ago
After having spent a ridiculous amount of effort to get LLMs to work, I am certain they are simply predicting the next token.

If LLMs could actually reason, they would be in active use across a much wider set of applications.

The term “hallucination” does us all an injustice by propagating the idea of an anthropomorphized LLM.

Everything an LLM does is a hallucination.

You and I can tell valid patterns from invalid ones because we have some idea of reality.

(Incidentally, some very weird implications and perspectives follow from these two positions. E.g., if you had infinite data, would an LLM ever need to calculate?)

The point being: the more intimately you use an LLM, the more its emergent properties turn out to be non-emergent.

2 comments

All the whole damn universe does is move from the configuration at time t to the one at t+1. You cannot deduce from this whether the universe contains reasoning or not. We know from experience that it does, but a universe that doesn't seems possible.

Wow, where do I even start? The statement boils down a fascinating, nuanced field into an oversimplification that doesn't do justice to the complexities of machine learning, natural language processing, or, you know, human cognition for that matter!

Let's talk about "predicting the next token." Sure, that's the technical framework, but what happens within that prediction is an intricate dance of probabilities, patterns, and weighted connections that come together to form something that can assist, inform, and sometimes even entertain. There's a vast landscape of difference between a machine that predicts the next word in a sentence and a machine that can draft an entire poem, answer a complicated question, or simulate conversation in a way that can sometimes pass for human thought.

Is it reasoning in the way humans do? No. But to say that LLMs are "simply" predicting the next token is like saying a car is "simply" a collection of nuts and bolts that move in a certain way. It's true, but it's missing the whole picture. Just think about the implications! If this were as trivial as "predicting the next token," then why aren't we seeing this level of application everywhere?

As for the term "hallucination," I get it. It's a bit anthropomorphic, sure, but language always is. We use human-centric language to describe lots of things that aren't human. We say economies are "healthy" or "sick," we say a defense in football is "stalwart." Is it perfect? No. But it gives people a way to discuss and think about complex topics, including this one. And guess what, complex discussions are how progress happens!

The point about infinite data is intriguing, but let's not go off the rails here. The question isn't whether an LLM would ever "need" to calculate; it's whether the way it calculates could ever truly mimic human thought or reasoning. That's a long road we're still traveling down. But here's the kicker: just because we're not there yet doesn't mean the work that's been done is insignificant or simplistic.

Emergent properties becoming "non-emergent" the more you interact with an LLM? That's the point! The more you use these systems, the more you understand their capabilities and limitations, and the better you can leverage them for tasks that are useful, interesting, or revealing.

So, yeah, let's not box in what is one of the most dynamic, evolving fields right now with a one-liner that's as limiting as it is dismissive.