by Chabsff
213 days ago
Yeah, but that's their interface. That tells you surprisingly little about their inner workings. ANNs are arbitrary function approximators. The training process uses statistical methods to identify a set of parameters that approximate the target function as well as possible. That doesn't necessarily mean that the end result is equivalent to a very fancy multi-stage linear regression. It's a possible outcome of the process, but it's not the only possible outcome. Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
|
I'm not sure I follow. LLMs do probabilistic next-token prediction based on the current context. That is a factual, foundational statement about the technology that runs all LLMs today.
We can ascribe other things to that, such as reasoning or knowledge or agency, but that doesn't change how they work. Their fundamental architecture is well understood, even if we allow for the idea that maybe there are some emergent behaviors that we haven't described completely.
> It's a possible outcome of the process, but it's not the only possible outcome.
Again, you can ascribe these other things to it, but it's strange to say that these external descriptions of outputs call into question the architecture that runs these LLMs.
> Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
I don't see how that's a misconception. We evaluate pretty much everything by its inputs and outputs, and we use those to infer internal state, because that's all we're capable of in the real world.
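For what it's worth, here is a minimal sketch of what "probabilistic next-token prediction based on current context" means at the interface level. It is a toy under stated assumptions, not how any real LLM is implemented: `VOCAB`, `toy_logits`, `sample_next`, and `generate` are hypothetical names, and `toy_logits` is a hand-written stand-in for whatever a trained network actually computes.

```python
import math
import random

# Hypothetical four-token vocabulary, for illustration only.
VOCAB = ["the", "cat", "sat", "."]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(context):
    """Stand-in for the model: one score per vocabulary entry.

    Assumption: this hand-written rule just favors the token that
    follows the last one in VOCAB order. A real network replaces
    this function with learned parameters.
    """
    if not context:
        return [2.0, 0.0, 0.0, 0.0]
    last = VOCAB.index(context[-1])
    nxt = (last + 1) % len(VOCAB)
    return [3.0 if i == nxt else 0.0 for i in range(len(VOCAB))]


def sample_next(context, rng):
    """Sample one token from the distribution given the context."""
    probs = softmax(toy_logits(context))
    return rng.choices(VOCAB, weights=probs, k=1)[0]


def generate(n_tokens, seed=0):
    """Autoregressive loop: each sampled token is appended to the context."""
    rng = random.Random(seed)
    context = []
    for _ in range(n_tokens):
        context.append(sample_next(context, rng))
    return context


if __name__ == "__main__":
    print(" ".join(generate(8)))
```

The loop only fixes the contract: context in, distribution over tokens out, sample, append, repeat. What happens inside the function that produces the scores is a separate question, which is the part this thread is arguing about.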