Hacker News
by inopinatus 821 days ago
Even this seems too grand a claim. I’d water it down thus: the LLM encodes that the token(s) for “Spot” are probabilistically plausible in the ensuing output.
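
To make the watered-down claim concrete, here is a toy sketch of what "probabilistically plausible" means mechanically. The logits below are invented for illustration, as is the imagined prompt ("My dog's name is"); nothing here comes from a real model. It only shows how raw scores become a next-token distribution in which "Spot" outranks "Gary".

```python
import math

def softmax(logits):
    """Turn raw per-token scores into a probability distribution."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Hypothetical logits for the token after "My dog's name is".
# These numbers are made up; a real model would score its whole vocabulary.
logits = {"Spot": 4.0, "Rex": 3.5, "Fido": 3.2, "Gary": 0.5, "Florence": 0.1}

probs = softmax(logits)
ranked = sorted(probs, key=probs.get, reverse=True)

for tok in ranked:
    print(f"{tok:10s} {probs[tok]:.3f}")
```

Whether a high score for "Spot" amounts to understanding what a dog name is cannot be read off the distribution itself, which is the point in dispute.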
1 comments

...because it understands what a dog name is. Why wouldn't you see Gary or Florence in that list? How does it know those aren't dog names?

You can't be suggesting it has memorized the relationships between all concepts; the model would have to be enormous.

So clearly, there is something else going on. It's able to encode concepts/ideas.

The model is enormous, and N-dimensional for very high N. But the model remains insufficiently enormous for understanding, and moreover, the model cannot observe itself and adjust.

Ask an LLM to extrapolate, and watch any semblance of reason collapse.

Extrapolate what?