by localfirst
697 days ago
> huge slice of human psyche encoded into it than the language itself and the corpus of texts in it
This type of wording is problematic because it treats what is written as representative of our psyche when it is not. Psyche implies thinking: thoughts that occurred when the brain accessed concepts from outside our 3D world, processed them internally, vocalized them as sounds, and finally put them into letters. LLMs are just doing pattern text search on top of what is written. They are not doing any sort of reasoning or accessing the hyper dimensional plane like our brain does when it thinks or reasons with concepts. Our brains are not some exotic token or neurosymbolic search engines!
|
The biological capabilities of a single human are also not very impressive, by the way. 90% (made up number) of what you consider your intelligence is actually the result of biological evolution and social processes accumulating and abstracting knowledge over endless generations. A hypothetical you, raised without any contact with other humans, society, culture, or education, would be substantially different. So the processes are not just in your brain.
Whether you or I are doing "reasoning" is a matter of definition, and it's a really vague term. If you try to define it with more precision, you might conclude that all we do is post-rationalize the results of our blind prediction.
> This type of wording is problematic because it conflates what is written as representative of our psyche when it does not.
It definitely is representative, in some way. Human civilization did a huge amount of combined computation to encode human behavior (personal, social, all kinds) into the abstractions and semantics hidden in language and text. Surely it can be recovered with some precision by statistical analysis and some computation, which is what a large language model does.
Of course this "reverse engineering" approach has limitations. The model might not be able to generalize well enough to pick up higher level semantics. It might be architecture-limited. Some data might just not be in the dataset. The model will never be able to copy humans 100% without an extremely precise biological reference, just as you'll never be able to copy a dolphin, an alien, or a model. But having an artificial human is not the point of this, and the achievable precision might be just good enough.