Hacker News
by ACCount37 6 days ago
And knowing that structure is about as meaningful as knowing "a PC consists of a keyboard, on which you type, a screen, at which you look, and a processor, which does things with binary logic".

None of that helps you understand how exactly LLMs do what they do. Because it describes an interface, not a mechanism.

The inner mechanisms of an LLM are more learned than designed. We know what an LLM does on a low level, but going from that to understanding how they work is like trying to understand how a web browser works by looking at netlists of a CPU. Low level understanding does not grant you high level understanding for free.

But ignoring all of that lets you cling to a very comforting "we understand LLMs because we made them". Ha ha. As if.

> And we also know that human beings do not hold 'internal representations' like any AI system needs to.

Bold fucking claim. Got a source on that?

Because neurobiology has been trying to crack neural representations - the very internal representations brains use - for as long as it has existed, and with some success. Both reading and injecting internal representations into the brain are possible now, in narrow cases. The specifics vary region to region, but sparse population coding is a true staple. Today's SOTA for wrangling this mess is ML decoders, and not by coincidence.


We know how LLMs learn at the fundamental level. What we do not know is the actual dynamic process of encoding embeddings and their distributions.

Your analogies about the PC and web browser are not correctly formulated, because in the case of the PC you talk about 'external components' (you should be talking about cpu arch, structure, digital components, interfaces, etc); in the case of the web browser, you should be talking about modules, code, etc.

We do know how LLMs are laid out: layers, attention heads, etc. So what we need to look at are the fundamental possibilities of the structure of LLMs, not how the weights are distributed.

> > And we also know that human beings do not hold 'internal representations' like any AI system needs to.

> Bold fucking claim. Got a source on that?

Part of the sources are in the books I mentioned. Nonetheless, you can still fact-check and refute in an adult and serious manner, not in a disrespectful and arrogant way. If my claim sounded arrogant, I apologize, but as I already mentioned, my references back that claim.

Regarding internal representations in the brain: I guess you are referring to areas of the brain being activated when a subject receives a stimulus, as tested through MRI. I would be cautious about causally relating stimuli to neuron activations, since you first need to know whether the exact configuration of the cells involved and their connections allows for such a representation (which I think is still not known -- again, AFAIK, the contrary seems to be the case).

Your references that "back that claim", which are in "books you mentioned", which you "mentioned" who knows where.

Yeah, no. I'm not walking that chain. If you want to, do it, but for now, I'm filing it as "has no evidence and knows it".

By now, there's plenty of works, up to and including direct neural interfaces. Utah arrays, Michigan arrays. Stab the brain, dump the spike trains, decode. You crack the manifold open by correlating to known stimuli using ML, and generalize from there to unknown stimuli. There is no need to "know the exact configuration", and few bother - you put your hardware into the part of the brain you want (top level map is consistent enough brain to brain), gather a set of reference points, and use them to anchor the rest of the decoding process.

Why use ML? Because you need a very expressive correlator to bridge the gap between known inputs and the products of whatever transformations the brain subjects them to before they show up in spike trains.
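
For the curious, the calibrate-then-generalize loop described above can be sketched on synthetic data. Everything in it is an assumption made for illustration: the cosine tuning curves, the Poisson noise, the firing rates, and a plain least-squares decoder standing in for the more expressive ML models real work uses. The function names are made up, and this is not any lab's actual pipeline.

```python
import numpy as np


def simulate_spike_counts(angles, preferred, rng, base_rate=5.0, gain=20.0):
    """Synthetic spike counts: cosine-tuned neurons with Poisson noise (assumed model)."""
    # Each neuron fires most when the stimulus angle matches its preferred angle.
    rates = base_rate + gain * (1.0 + np.cos(angles[:, None] - preferred[None, :])) / 2.0
    return rng.poisson(rates)


def fit_linear_decoder(counts, angles):
    """Least-squares map from spike counts (plus a bias column) to (cos, sin) of the stimulus."""
    X = np.hstack([counts, np.ones((counts.shape[0], 1))])
    Y = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def decode(counts, W):
    """Apply the fitted map and turn the (cos, sin) estimate back into an angle."""
    X = np.hstack([counts, np.ones((counts.shape[0], 1))])
    Y = X @ W
    return np.arctan2(Y[:, 1], Y[:, 0])


def mean_angular_error(a, b):
    """Mean absolute difference between two sets of angles, wrapped to [-pi, pi]."""
    diff = np.angle(np.exp(1j * (a - b)))
    return float(np.mean(np.abs(diff)))


def run_demo(seed=0, n_neurons=40, n_train=400, n_test=200):
    rng = np.random.default_rng(seed)
    preferred = rng.uniform(0.0, 2.0 * np.pi, n_neurons)

    # "Known stimuli": the reference points used to anchor the decoder.
    train_angles = rng.uniform(0.0, 2.0 * np.pi, n_train)
    train_counts = simulate_spike_counts(train_angles, preferred, rng)
    W = fit_linear_decoder(train_counts, train_angles)

    # "Unknown stimuli": held-out trials the decoder never saw during fitting.
    test_angles = rng.uniform(0.0, 2.0 * np.pi, n_test)
    test_counts = simulate_spike_counts(test_angles, preferred, rng)
    return mean_angular_error(decode(test_counts, W), test_angles)


if __name__ == "__main__":
    print(f"mean decoding error (radians): {run_demo():.3f}")
```

run_demo() returns the mean angular error in radians on the held-out stimuli; random guessing would sit around pi/2. Note that the decoder never looks at the wiring between the simulated neurons, only at the correlation between known stimuli and recorded counts.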

> So what we need to look at are the fundamental possibilities of the structure of LLMs, not how the weights are distributed.

And the fundamental possibilities are... what exactly? We know the I/O planes, we know the possible flow of information, now, what does that give us?

We know enough to prove that a transformer LLM can implement a Turing machine, the same way a CPU can implement a Turing machine. So an LLM is capable of performing arbitrary computation within its capacity. That's it. That's the upper bound.

What follows is: if you can represent "thinking" as a computational process, you can implement it with a Turing machine, and thus, an LLM can be made to think. That proves LLMs can think. But not that the existing ones do or don't! Because that's the entire thing about upper bounds!

We've looked at LLM architecture, and learned basically nothing about whether LLMs think, other than "it's not impossible". That's the actual "fundamental possibilities" we derived from knowing the architecture. One step above worthless. Oh fun.

(If thinking requires hypercomputation, then, nope. LLMs are out. Good luck proving that it does though.)

> Your references that "back that claim", which are in "books you mentioned", which you "mentioned" who knows where. Yeah, no. I'm not walking that chain. If you want to, do it, but for now, I'm filing it as "has no evidence and knows it".

You are free not to believe me and dismiss the whole point. I do have evidence and I know it, no need to prove that (to begin with, the references are there. Read them if you want to expand your knowledge).

> By now, there's plenty of works, up to and including direct neural interfaces. Utah arrays, Michigan arrays. Stab the brain, dump the spike trains, decode. You crack the manifold open by correlating to known stimuli using ML, and generalize from there to unknown stimuli. There is no need to "know the exact configuration", and few bother - you put your hardware into the part of the brain you want (top level map is consistent enough brain to brain), gather a set of reference points, and use them to anchor the rest of the decoding process.

I am familiar with those works. Seeing the stimuli/activation correlation does not imply causal representation of the stimuli. It implies the causal activation of neural structures, at most.

> What follows is: if you can represent "thinking" as a computational process, you can implement it with a Turing machine, and thus, an LLM can be made to think. That proves LLMs can think. But not that the existing ones do or don't! Because that's the entire thing about upper bounds!

Alas! Assumption spotted. IF you can represent "thinking" as a computational process, then you could implement a thinking machine. You need to prove first that thinking _is_ a computational process, _then_ you could go and try to implement such a machine, and because you proved that thinking is a computational process, you would be certain that such a machine can theoretically be built. But until you prove your assumption right, you are just trying blindfolded. The harm in the actual field/society regarding AI is that _we don't even know if thinking can be modeled as a computational process_. And no, this does not have anything to do with science. (By the way, I would not regard AI research as science since it is strictly studying an engineered artifact, but that's another story).

Knowing what exact algorithm "thinking" is isn't a requirement. Automata class is enough to say "a Turing machine can implement it".

There are exactly two possibilities: thinking can be expressed as computation, or thinking requires hypercomputation.

I did acknowledge both, explicitly.

Which one?

I'm betting hard against the second one, by the way. Because it requires hypercomputational magic fairy dust to:

1) exist - physical Church-Turing thesis has to be proven wrong empirically

2) be so involved in the functioning of human brain that it cannot be substituted for anything else

Wishful thinking, in my eyes.

But that's the name of the game, isn't it? Anything but admitting that your mind is a glorified math construct implemented in wet meat.

> Knowing what exact algorithm "thinking" is isn't a requirement. Automata class is enough to say "a Turing machine can implement it".

I don't know what you are referring to by the word 'thinking'. But in any case, if you declare that it is not necessary to know the algorithm behind thinking, how can you then say that a Turing machine can implement it? How can you say you implemented something when you don't know how it works or how it is constituted? The only option I see then is that you implement something that is phenomenically identical to human intelligence, provided that you exhaust all possible combinations of human intelligence phenomena in a descriptive, extensional way (which, if you assume a finite extension of such phenomena, in any case, and most probably, gets you in the trouble of counting uncountable finite sets).

> There are exactly two possibilities: thinking can be expressed as computation, or thinking requires hypercomputation.

Again, if you do not define what 'thinking' is and how and on what assumptions it can be described as a computational process, this claim is empty.

So as far as I see it, you are still trapped by the assumption that the brain or mind are fundamentally similar to the kind of machines we can build.

> But that's the name of the game, isn't it? Anything but admitting that your mind is a glorified math construct implemented in wet meat.

Again, some assumptions are operating here that tell you that the brain is some kind of hardware. And again: there is no real evidence that the body/consciousness 'construct' has any relation or analogy to the hardware/software/machine idea. Quite the contrary. Since the science that occupies itself with these topics is on the very frontier of knowledge and experimentation, reading scientific literature alone will not clarify your thoughts. You will need additional guidance, and that guidance is called philosophy.

I recognize that the references I posted in my original comment are hard to read. But that's the point with the AI/mind debate: it is a tough, bitter topic. Just reading AI research won't bring anyone to the level this research space needs in order to discuss these topics.