Hacker News
by raincole 9 days ago
> What I mean, is the LLM is able to represent things in space. That part I don't understand.

Why do you think this is mutually exclusive with "LLM predicts the next token"?

If you tell someone from the 19th century that bytes (just 0s and 1s!) can represent an opera, a song, or even a whole interactive experience, they might be really confused. But there is no reason bytes can't.

If you tell someone without a math background that sums of smaller and smaller sine waves can represent pretty much anything in our universe, they might be really confused. But there is no reason those sums can't.
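
To make the sine-wave point concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the square wave as the target, and the function names `square_wave`, `fourier_partial_sum`, and `mean_abs_error`, are all made up for this example. It adds up the odd sine harmonics of a square wave (the standard Fourier series for that shape) and measures how far the partial sum is from the target on average.

```python
import math

def square_wave(t):
    """Target signal: +1 where sin(t) is non-negative, -1 elsewhere."""
    return 1.0 if math.sin(t) >= 0 else -1.0

def fourier_partial_sum(t, n_terms):
    """Sum of the first n_terms odd sine harmonics of the square wave.

    Each term is (4/pi) * sin(k*t) / k for k = 1, 3, 5, ...
    so every added wave is smaller than the one before it.
    """
    return (4 / math.pi) * sum(
        math.sin(k * t) / k for k in range(1, 2 * n_terms, 2)
    )

def mean_abs_error(n_terms, samples=1000):
    """Average absolute gap between target and partial sum over one period."""
    # Sample at midpoints so no sample lands exactly on a jump of the square wave.
    ts = [2 * math.pi * (i + 0.5) / samples for i in range(samples)]
    return sum(
        abs(square_wave(t) - fourier_partial_sum(t, n_terms)) for t in ts
    ) / samples

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, "terms, mean abs error:", mean_abs_error(n))
```

Running it should show the average error shrinking as more terms are added, even though each individual term is "just a sine wave".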

There is simply no reason that a next-token predictor can't generate a nice-looking checkbox.

1 comments

You're talking about simple compression and encoding mechanisms, and by implication you're drawing an analogy to an LLM encoding/compressing the information.

And sure, it does, but the person you're replying to was trying to understand why it also seems to reason about the query and give an answer consistent with it, despite not being trained on that query or answer. Your answer seems to imply that it's just another slick, complex encoding.

But the emergent property of trillions of digital neurons predicting the next token is that in the process of being trained to do so, they can also learn to reason.

At some scale, it is efficient to encode cognition that is capable of mimicking the cognition that generated the input tokens.