by omneity
187 days ago
While an LLM is trained on trillions of tokens to acquire its capabilities, it does not actively retain or recall the vast majority of them, and often enough it cannot make simple deductive inferences either (e.g. having learned “X owns Y” does not necessarily let it conclude “Y belongs to X”). The acquired knowledge is a lot less uniform than you’re proposing, and in fact it is full of gaps a human would never have. More critically, the model is not able to peer into all of its vast knowledge at once, so with every prompt what you get is closer to an “instance of a human” than the “all of humanity” you might think of LLMs as. (I train and dissect LLMs for a living and for fun.)
|
They mentioned that the amount of training data is much larger for an LLM; an LLM's recall not being uniform was never in question.
No one expects compression to be lossless when the model is scaled below the entropy of the knowledge in its training set.
I am not saying LLMs do simple compression, just pointing out a mathematical certainty.
(And I think you don't need to be an expert in creating LLMs to understand them. A lot of people here have experience with that as well, so I find the additional emphasis on it moot.)
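
The counting argument behind the "compression below entropy must be lossy" point can be sketched in a few lines of Python. This is only an illustration under a toy assumption that is not from the thread: N equally likely "facts" stored in a fixed budget of bits. The function names are hypothetical, and nothing here models how an LLM actually stores knowledge.

```python
def min_bits_lossless(num_items: int) -> int:
    """Smallest whole number of bits that gives every one of
    `num_items` equally likely items its own distinct code."""
    if num_items < 1:
        raise ValueError("num_items must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float error.
    return (num_items - 1).bit_length()


def max_recoverable_fraction(num_items: int, budget_bits: int) -> float:
    """Upper bound on the fraction of items that can survive a round trip.

    A budget of `budget_bits` allows at most 2**budget_bits distinct codes,
    and a decoder can map each code back to only one item (pigeonhole).
    """
    if num_items < 1 or budget_bits < 0:
        raise ValueError("num_items must be >= 1 and budget_bits >= 0")
    return min(1.0, (2 ** budget_bits) / num_items)


if __name__ == "__main__":
    facts = 1024  # toy number of equally likely facts (an assumption)
    print("bits needed for lossless storage:", min_bits_lossless(facts))
    for budget in (10, 9, 8):
        print(budget, "bits ->", max_recoverable_fraction(facts, budget))
```

Once the budget drops below the bits needed for lossless storage, the recoverable fraction is forced below 1 no matter how clever the encoding is, which is all the "mathematical certainty" claims.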