Hacker News
by naasking 26 days ago
This is all speculative. We don't understand intelligence, so you literally have no idea whether what we recognize as intelligence is some suitable arrangement of "statistical token generation", especially once you add feedback loops.
2 comments

> "We don't understand intelligence, so you literally have no idea whether what we recognize as intelligence is some suitable arrangement of "statistical token generation""

Do you mean "token" as in the LLM sense?

Or are you thinking that thoughts in the human brain are also constructed out of some sort of underlying "token" even though the abstract thought happens and is held before any words are used to try to communicate that thought to an external party?

LLMs also don't run on tokens internally; tokens are just the inputs and outputs. The reasoning models do operate (partially) in the token space, but then so do I.
LLMs generate their output words sequentially based on probability (from learned stats).

Humans don't operate the same way: the thought happens and then the words are generated to reasonably describe that thought.

> the thought happens and then the words are generated to reasonably describe that thought.

Thoughts don't happen in a vacuum, they are triggered by external or internal stimuli, and these stimuli/thought precursors could very easily be tokens (dense info packets), which then map to latent space vectors, which very well could be thoughts.

Claims like "humans don't operate the same way" have no basis. Not only do we literally not know how humans operate mechanistically, and so we literally don't know the logical structure of human thought, but any system that is Turing complete is so easy to create that many wildly different mechanistic systems are fundamentally equivalent/interconvertible.

> Thoughts don't happen in a vacuum, they are triggered by external or internal stimuli, and these stimuli/thought precursors could very easily be tokens (dense info packets), which then map to latent space vectors, which very well could be thoughts.

Yes, possible, that's why I asked you above if that's what you meant by "token". Someone else responded and I didn't notice it wasn't you.

> Claims like "humans don't operate the same way" have no basis. Not only do we literally not know how humans operate mechanistically, and so we literally don't know the logical structure of human thought, but any system that is Turing complete is so easy to create that many wildly different mechanistic systems are fundamentally equivalent/interconvertible.

I think this position is too extreme; we do have some information.

We know how LLMs work when generating a sequence of words, and I know that my brain does not work the same way for word generation, because I am fully aware of the complete thought before I generate any words, externally or internally.

I know prior to generating words that my thought is X and the words I'm about to produce need to express that thought.

But with LLMs we know that the essence of what they produce is not known in advance, that they must complete the word generation process to fully realize the end result, and that multiple different end results are possible.
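For anyone who wants the "sequential, probabilistic word generation" claim made concrete, here is a toy sketch. Everything in it is an assumption for illustration: the `NEXT` table and its probabilities are invented, the `generate` helper is hypothetical, and the next word depends only on the previous word, whereas a real LLM conditions on the whole context through a learned network. It only shows the sampling loop itself: each chosen word is fed back in, and different random seeds can end in different sentences.

```python
import random

# Invented next-word probabilities, for illustration only.
# A real LLM computes these from the full context, not a lookup table.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.3, "bird": 0.7},
    "cat": {"sleeps": 0.6, "runs": 0.4},
    "dog": {"runs": 0.8, "sleeps": 0.2},
    "bird": {"sings": 1.0},
    "sleeps": {"</s>": 1.0},
    "runs": {"</s>": 1.0},
    "sings": {"</s>": 1.0},
}


def generate(seed, max_tokens=10):
    """Sample one sentence, one word at a time (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so a given run is reproducible
    tokens = ["<s>"]
    while tokens[-1] != "</s>" and len(tokens) < max_tokens:
        dist = NEXT[tokens[-1]]  # distribution given the previous word
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(word)  # the chosen word becomes input to the next step
    # Drop the start/end markers before returning.
    return [t for t in tokens if t not in ("<s>", "</s>")]


if __name__ == "__main__":
    for seed in range(5):
        print(seed, " ".join(generate(seed)))
```

Note that this sketch says nothing either way about whether a real model holds an "idea" internally before it emits words; it only illustrates the output-side sampling step that the comment describes.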

What I'm saying is that this is incorrect. An "idea" exists within a model before it generates tokens. This property does not distinguish humans from LLMs.

Additionally, "from learned stats" doesn't distinguish among a wide variety of things. I'm not aware of any other way to acquire knowledge from measurements. I'd bet that humans do this differently, based on the fact that humans can get further with less training data and that they learn actively during operation, but not so differently that 'learning stats' would be an inaccurate description.

> What I'm saying is that this is incorrect. An "idea" exists within a model before it generates tokens.

If that were the case, then the systems would generate words based on the fully resolved idea, but that is not how the LLM systems currently work (per vendors' descriptions).

They choose words sequentially, and both the specifics of the input and the chosen output words significantly impact not just the rest of the output but the very correctness of the output.

> but not so differently that 'learning stats' would be an inaccurate description.

Agreed, humans are generalizing using some mechanism that can be modeled with math.

But the execution of our reasoning and thought processes is not obviously similar to LLMs' next-word generation based on probabilities.

>that is not how the LLM systems currently work (per vendors' descriptions)

Anthropic says of their model[0]:

"""Claude sometimes thinks in a conceptual space that is shared between languages, suggesting it has a kind of universal “language of thought.”

{...}

Claude will plan what it will say many words ahead, and write to get to that destination. We show this in the realm of poetry, where it thinks of possible rhyming words in advance and writes the next line to get there. This is powerful evidence that even though models are trained to output one word at a time, they may think on much longer horizons to do so."""

Anthropic also created 'Golden Gate Claude'[1] by identifying the region of its architecture that corresponded to the concept of the Golden Gate Bridge and activating it. What would such a region exist for if Claude could only think one token at a time?

>the execution of our reasoning and thought processes is not obviously similar to LLMs'

"Not obviously similar" I can agree with. I don't think you've identified a way in which they are obviously different, though.

[0] https://www.anthropic.com/research/tracing-thoughts-language...

[1] https://www.anthropic.com/news/golden-gate-claude

We understand it enough to see the obvious massive deficiencies in LLMs.

They can predict likely sentences but not evaluate truth or logic. They can fairly reliably record facts about the world but not construct internal models of the world.

> They can predict likely sentences but not evaluate truth or logic.

They do, probabilistically. So do humans, as a matter of fact. The best of us are better at it than LLMs, but that's not persuasive evidence of anything meaningful really.

> They can fairly reliably record facts about the world but not construct internal models of the world.

You don't know that, unless you presuppose a very specific definition of world model that necessarily precludes emergent ones.

Humans do not reason by guessing the next most likely token/word. They use logic, morality and systems of thought they have constructed and shared to help them reason and don’t in any way predict tokens in a sequence - we use words to represent our thoughts and feelings about the world, not to construct them.

You’re constructing a post-hoc fantasy of human thought based on how LLMs work because you are desperate for some reason to believe that they are thinking like humans, but they are not. The process is very different and the results are also different.