Hacker News
by gpderetta 1101 days ago
> It’s an analogy for how LLMs work. An LLM does not know anything, it just adds tokens probabilistically based on the previous tokens

This seems like a deep statement, and it keeps getting repeated, but it doesn't mean anything. The probabilistic model used to decide the next token could be arbitrarily complex, including encoding knowledge (or just asking a panel of experts).

It seems pretty self-evident that the model does in fact encode knowledge, just in a very lossy way and with flawed recall.
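
To make that concrete, here is a toy sketch (not how any real LLM is implemented). The sampling step only needs a probability distribution over next tokens; it places no limit on how that distribution is computed. Everything named here (`FACTS`, `VOCAB`, `next_token_distribution`, `sample_next`, and the 0.98/0.01 weights) is made up for illustration.

```python
import random

# Hypothetical toy "knowledge" store. In a real model this would be spread
# across learned weights, not kept in a lookup table.
FACTS = {("capital", "of", "France", "is"): "Paris"}
VOCAB = ["Paris", "London", "banana"]


def next_token_distribution(context):
    """Return P(next token | context) as a dict.

    The sampler below does not care how this is computed: it could be a
    lookup, a neural network, or a panel of experts.
    """
    key = tuple(context[-4:])
    if key in FACTS:
        # Knowledge shows up as probability mass on the right token.
        # The mass is 0.98 rather than 1.0, so recall is imperfect.
        return {tok: (0.98 if tok == FACTS[key] else 0.01) for tok in VOCAB}
    # Nothing relevant is known: fall back to a uniform guess.
    return {tok: 1 / len(VOCAB) for tok in VOCAB}


def sample_next(context, rng):
    """Pick the next token "probabilistically based on the previous tokens"."""
    dist = next_token_distribution(context)
    tokens = list(dist)
    weights = [dist[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt = ["The", "capital", "of", "France", "is"]
    print(sample_next(prompt, rng))
```

The generation loop is the same in both branches; the only difference is whether the distribution reflects something the model has stored.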

1 comment

It sure does encode some knowledge, because it's a language model and languages already do so on their own. It's far from what you'd usually call a "knowledge model" though.