by strangescript
323 days ago
"If a machine is consuming and transforming incalculable amounts of training data produced by humans, discussed by humans, and explained by humans. Why would the output not look identical to human reasoning? If I were to photocopy this article, nobody would argue that my photocopier wrote it and therefore can think. But add enough convolutedness to the process, and it looks a lot like maybe it did and can." But its not copying it. That is the entire point. Its using the training data to adjust floating point numbers. If you train on a single data piece over and over again, then yes it can replicate it, just like you can memorize lines of a school play, but its still not copied/compressed in the traditional, deterministic sense. You can't argue "we don't know how they work, or our own brains work with any certainty" and then over-trivialize what they do on the next argument. People suffer brain damage and come out the other side with radically different personalities. What happened to "qualia", or "sense of self", where is their "soul". Its just a mechanistic emergent property of their biological neural network. Who is to say our brains aren't just very high parameterized biological floating point machines? That is the true Occam's Razor here, as uncomfortable as that might make people. |
|
I believe it's quite possible that what happens during training is in certain ways similar to what happens when a child learns about the world, although there are many practical differences (and I don't even mean the difference between human neurons and the ones in a neural network).
Is there anything to feel uncomfortable about? People have long been discussing the idea that "a self doesn't exist, we're just X", where X was whichever concept was newest and most popular at the time. I'm 100% sure LLMs won't be the last one.
(BTW, as for LLMs themselves, there are still two big engineering problems to solve: quite small context windows and hallucinations. The first requires a lot of money to solve. The second needs special approaches and a lot of trial and error, and even then the last 1% might be almost impossible to get working reliably.)