Hacker News
by kybernetikos 471 days ago
> I think we have a pretty good idea that we are not stochastic parrots - sophisticated or not. Anyone suggesting that we’re running billion parameter models

On the contrary, we have 86B neurons in the brain; the weighting of the connections is the important thing, but we are definitely 'running' a model with many billions of parameters to produce our output.

The theory by which the brain mainly works by predicting the next state is called predictive coding theory, and I would say that I find it pretty plausible. At the very least, we are a long way from knowing for certain that we don't work in this way.

2 comments

> On the contrary, we have 86B neurons in the brain

The neurons (cells) in even a fruit fly's brain are orders of magnitude more complex than the "neurons" (theoretical concept) in a neural net.

> the weighting of the connections is the important thing

In a neural net, sure.

In a biological brain, many more factors are important: The existence of a pathway. Antagonistic neurotransmitters. NT re-incorporation. NT-binding sensitivity. Excitation potential. Activity of Na/K channels. Moderating enzymes.

Even what we last ate or drank, how rested, old, and hydrated we are, when our last physical activity took place, and all the interactions prior to an input influence how we analyse and integrate it.

> but we are definitely 'running' a model with many billions of parameters to produce our output.

No, we are very definitely not. Many of our mental activities have nothing to do with state prediction at all.

We integrate information.

We exist as a conscious agent in the world. We interact, and by doing so change our own internal state alongside the information we integrate. We are able to, from this, simulate our own actions and those of other agents, and model the world around us, and then model how an interaction with that world would change the model.

We are also able to model abstract concepts both in and outside the world.

We understand what concepts, memories, states, and information mean both as abstract concepts and concrete entities in the universe.

We communicate with other agents, simultaneously changing their states and updating our modeling of their internal state (theory of mind, I know that you know that I know, ...)

We filter, block, change, and create information.

And of course we constantly learn and change the way we do ALL OF THIS, consciously and subconsciously.

> At the very least, we are a long way from knowing for certain that we don't work in this way.

https://en.wikipedia.org/wiki/Russell%27s_teapot

OK, let me be more clear, because I'm not sure what you're arguing against.

If the process in the brain is modellable at all, then it is certainly a model with at a minimum many billions of parameters. Your list of additional parameters, if anything, supports that rather than arguing against it. If you want to argue with that contention, I think you need to argue that the process isn't modellable, which, if you want to talk about burden of proof, would place a huge burden on you. But maybe I misunderstood you. I thought you were saying that it's ludicrous to say we're using as many as billions of parameters, but perhaps you're trying to say that billions is obviously far too small, in which case I agree.

My second point, which is that there's a live theory that prediction may be a core element of our consciousness, was intended as an interesting aside. I don't know how it will stand the test of time, and I certainly don't know if it's correct or not. I intended only to use it to prove that the things you seem to think are obvious are not in fact obvious to everyone.

For example, that big list of things that you are using as an argument against prediction doesn't work at all because you don't know whether they are implemented via a predictive process in the brain or not.

It feels like, rather than arguing against modellability, large numbers of parameters, or prediction, you're arguing against the notion that the human brain is exactly an LLM. That it isn't is so obviously true that I don't think anyone actually disagrees with it.

> Your list of additional parameters if anything supports that rather than arguing against it.

> perhaps you're trying to say that billions is obviously far too small, in which case I agree.

No, it doesn't, and I don't.

The processes that happen in a living brain don't just map to "more params". It doesn't matter how many learnable parameters you have: unless you actually change the paradigm, an LLM or similar construct is incapable of mapping a brain, period. The simple fact that the brain's internal makeup is itself changeable already prevents that.

> prediction may be a core element of our consciousness

No it isn't, and it's trivially easy to show that.

Many meditative techniques exist where people "empty their mind". They don't think or predict anything. Does that stop consciousness? Obviously not.

Can we do prediction? Sure. Is it a "core element", i.e. indispensable for consciousness? No.

I am not a neuroscientist, but I think it's likely that LLMs (with 10s/100s of billions of parameters) and the human brain (with 1-2 orders of magnitude more neural connections[1]) process language in analogous ways. This process is predictive, stochastic, sensitive to constantly-shifting context, etc. IMO this accounts for the "unreasonable effectiveness" of LLMs in many language-related tasks. It's reasonable to call this a form of intelligence (you can measure it, solve problems with it, etc).

But language processing is just one subset of human cognition. There are other layers of human experience like sense-perception, emotion, instinct, etc. – maybe these things could be modeled by additional parameters, maybe not. Additionally, there is consciousness itself, which we still have a poor understanding of (but it's clearly different from intelligence).

So anyway, I think that it's reasonable to say that LLMs implement one sub-set of human cognition (the part that has to do with how we think in language), but there are many additional "layers" to human experience that they don't currently account for.

Maybe you could say that LLMs are a "model distillation" of human intelligence, at 1-2 orders of magnitude less complexity. Like a smaller model distilled from a larger one, they are good at a lot of things but less able to cover edge cases and accuracy/quality of thinking will suffer the more distilled you go.
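For anyone unfamiliar with the term, here is a minimal sketch of what "distillation" usually means in ML: a small student model is trained to match a larger teacher's softened output distribution. The logits, the temperature value, and the function names are all made up for illustration. This is a toy loss computation, not a training loop, and it leaves out the temperature-squared scaling that real setups often add.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature = softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """How far the student's softened outputs are from the teacher's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl_divergence(p, q)

# Hypothetical logits over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]   # close to the teacher
poor_student = [0.5, 1.0, 4.0]   # ranks the tokens the other way round

loss_good = distillation_loss(teacher, good_student)
loss_poor = distillation_loss(teacher, poor_student)
print(f"good student loss: {loss_good:.4f}")
print(f"poor student loss: {loss_poor:.4f}")
```

The student only ever sees the teacher's outputs, not whatever produced them, which is the sense of the analogy here.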

We tend to equate "thinking" with intelligence/language/reason thanks to 2500 years of Western philosophy, and I believe that's where a lot of confusion originates in discussions of AI/AGI/etc.

[1]: https://medicine.yale.edu/lab/colon-ramos/overview/#:~:text=...

> I am not a neuroscientist, but I think it's likely that LLMs (with 10s of billions of parameters) and the human brain (with 1-2 orders of magnitude more neural connections[1]) process language in analogous ways

Related is the platonic representation hypothesis where models apparently converge to similar representations of relationships between data points.

https://phillipi.github.io/prh/ https://arxiv.org/abs/2405.07987
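To make "similar representations of relationships between data points" a bit more concrete, here is a rough sketch of one way to compare two embedding spaces: check whether the same items end up as nearest neighbours in both. The points and function names are invented for illustration, and I'm not claiming this is exactly the metric used in the linked paper.

```python
import math

def nearest_neighbors(vectors, index, k):
    """Indices of the k items closest (Euclidean distance) to vectors[index]."""
    others = [j for j in range(len(vectors)) if j != index]
    others.sort(key=lambda j: math.dist(vectors[index], vectors[j]))
    return set(others[:k])

def knn_alignment(space_a, space_b, k=2):
    """Average fraction of k-nearest neighbours that each item shares across two spaces."""
    n = len(space_a)
    overlap = 0.0
    for i in range(n):
        shared = nearest_neighbors(space_a, i, k) & nearest_neighbors(space_b, i, k)
        overlap += len(shared) / k
    return overlap / n

# Six hypothetical items embedded by "model A" (two clusters in 2-D).
space_a = [(0, 0), (1, 0), (0, 2), (10, 10), (11, 10), (10, 12)]
# "Model B": different scale, axes and dimensionality, same neighbourhood structure.
space_b = [(0, 0, 5), (0, 3, 5), (6, 0, 5), (30, 30, 5), (30, 33, 5), (36, 30, 5)]
# "Model C": the same coordinates as A but assigned to different items.
space_c = [(0, 0), (10, 10), (1, 0), (11, 10), (0, 2), (10, 12)]

score_ab = knn_alignment(space_a, space_b)
score_ac = knn_alignment(space_a, space_c)
print(f"A vs B: {score_ab:.2f}")
print(f"A vs C: {score_ac:.2f}")
```

The score only looks at who is near whom, so two models can use completely different coordinates and still count as aligned.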

Interesting. I'm not sure I'd use the term "Platonic" here, because that tends to have implications of mathematical perfection / timelessness / etc. But I do think that the corpuses of human language that we've been feeding to these models contain within them a lot of real information about the objective world (in a statistical, context-dependent way as opposed to a mathematically precise one), and the AIs are surfacing this information.

To put this another way, I think that you can say that much of our own intelligence as humans is embedded in the sum total of the language that we have produced. So the intelligence of LLMs is really our own intelligence reflected back at us (with all the potential for mistakes and biases that we ourselves contain).

Edit: I fed Claude this paper, and "he" pointed out to me that there are several examples of humans developing accurate conceptions of things they could never experience based on language alone. Most readers here are likely familiar with Helen Keller, who became an accomplished thinker and writer in spite of being blind and deaf from infancy (Anne Sullivan taught her language despite great difficulty, and this became Keller's main window to the world). You could also look at the story of Eşref Armağan, a Turkish painter who was blind from birth – he creates recognizable depictions of a world that he learned about through language and non-visual senses.

Try taking any of the LLM models we have, and making it learn (adjust its weights) based on every interaction with it. You'll see it quickly devolves into meaninglessness. And yet we know for sure that this is what happens in our nervous system.

However, this doesn't mean in any way that an LLM might not produce output equal or even superior to a human's in certain very useful circumstances. It just means it functions fundamentally differently on the inside.
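A toy sketch of the weight-update problem, under heavy assumptions: a one-parameter linear model stands in for the network, and the later "interactions" directly contradict the original training data. It only illustrates one failure mode of naive per-interaction updates (old behaviour gets overwritten), not what would actually happen inside a real LLM.

```python
def sgd_step(w, x, y, lr=0.1):
    """One gradient step on the squared error (w*x - y)**2 / 2."""
    return w - lr * (w * x - y) * x

def mean_sq_error(w, data):
    """Average squared prediction error of weight w on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

xs = [-1.0, -0.5, 0.5, 1.0]
task_a = [(x, 2.0 * x) for x in xs]    # original "knowledge": y = 2x
task_b = [(x, -2.0 * x) for x in xs]   # later interactions: y = -2x

w = 0.0
for _ in range(200):                   # initial training on task A
    for x, y in task_a:
        w = sgd_step(w, x, y)
error_before = mean_sq_error(w, task_a)

for _ in range(200):                   # naive weight update on every new interaction
    for x, y in task_b:
        w = sgd_step(w, x, y)
error_after = mean_sq_error(w, task_a)

print(f"error on task A before online updates: {error_before:.4f}")
print(f"error on task A after online updates:  {error_after:.4f}")
```

Nothing in the update rule protects what was learned earlier, so the model simply follows whatever it saw most recently.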

Maybe this is just a conversation about what "fundamentally differently" means then.

Obviously the brain isn't running an exact implementation of the attention paper, and your point about how the brain is more malleable than our current LLMs is a great one, but that just proves they aren't the same. I fully expect that future architectures will be more malleable. If you think that such hypothetical future architectures will be fundamentally different from the current ones, then we agree.