| HN Mirror


by dagss 135 days ago

Isn't talking about "here’s how LLMs actually work" in this context a bit like saying "a human can't be relevant to X because a brain is only a set of molecules, neurons, synapses"?

Or even "this book won't have any effect on the world because it's only a collection of letters, see here, black ink on paper, that is what it IS, it can't DO anything"...

Saying an LLM is a statistical prediction engine for the next token is, IMO, sort of confusing what it is with the medium it is expressed in or built from.

Take, for instance, those small experiments that train a network on addition problems, mentioned in a sibling post. The weights end up forming an addition machine. An addition machine is what it is; that is the emergent behavior. The machine learning weights are just the medium it is expressed in.

What's interesting about LLMs is such emergent behavior. Yes, it's statistical prediction of likely next tokens, but training weights for that might well have the side effect of wiring up some kind of "intelligence" (for reasonable everyday definitions of the word "intelligence", such as programming as well as a median programmer). We don't really know this yet.
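
To make the addition example concrete, here is a minimal sketch in plain Python. It is not the experiment from the sibling post: the two-weight linear model, the learning rate, and the names `train_adder` and `predict` are all assumptions chosen for illustration. The model is only ever shown pairs of small numbers with their sums, yet the weights it settles on amount to an adder.

```python
import random


def train_adder(steps=2000, lr=0.05, seed=0):
    """Fit y = w[0]*a + w[1]*b to examples of a + b with plain SGD."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]  # random starting weights
    for _ in range(steps):
        # Training examples are pairs drawn from [-1, 1]; the target is their sum.
        a, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        err = (w[0] * a + w[1] * b) - (a + b)  # prediction minus target
        # Gradient of the squared error with respect to each weight.
        w[0] -= lr * 2.0 * err * a
        w[1] -= lr * 2.0 * err * b
    return w


def predict(w, a, b):
    """Apply the learned weights to a new pair of numbers."""
    return w[0] * a + w[1] * b


if __name__ == "__main__":
    w = train_adder()
    print(w)                     # both weights should end up very close to 1.0
    print(predict(w, 123, 456))  # very close to 579, far outside the training range
```

Nothing in the training loop mentions addition; it only reduces prediction error. A real network trained on digit strings is far messier than this, so treat it as an analogy for "the weights are the medium", not as evidence about LLMs.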

3 comments

ActorNightly 135 days ago

It's pretty clear that solving AI is a software problem; I don't think anyone would disagree.

But that problem is MUCH MUCH MUCH harder than people make it out to be.

For example, you can reliably train an LLM to produce accurate output of assembly code that can fit into a context window. However, let's say you give it a terabyte of assembly code - it won't be able to produce correct output as it will run out of context.

You can get around that with agentic frameworks, but all of those right now are manually coded.

So how do you train an LLM to correctly take any length of assembly code and produce the correct result? The only way is to essentially train the structure of the neurons inside it to behave like a computer, but the problem is that you can't do back-propagation with discrete 0 and 1 values unless you explicitly code the architecture for a CPU inside. So obviously, error correction with inputs/outputs is not the way we get to intelligence.

It may be that the answer is pretty much a stochastic search where you spin up x instances of trillion-parameter nets and make them operate in environments with some form of genetic algorithm, until you get something that behaves like a human, and any shortcutting to this is not really possible because of essentially chaotic effects.
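
The point about back-propagation and discrete 0/1 values can be illustrated with a minimal sketch. The function names `hard_bit`, `sigmoid`, and `numeric_grad` are assumptions made up for this example, and it only shows the gradient of a single unit, not a whole network: a hard threshold has zero slope away from the switching point, so an error signal has nothing to follow, while a smooth unit does.

```python
import math


def hard_bit(x):
    """A discrete unit: outputs exactly 0.0 or 1.0."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    """A smooth unit: outputs anything between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def numeric_grad(f, x, eps=1e-4):
    """Central-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-2.0, 0.5, 3.0):
        # The hard unit reports zero slope at each of these points;
        # the sigmoid reports a small positive slope.
        print(x, numeric_grad(hard_bit, x), numeric_grad(sigmoid, x))
```

With a zero gradient, a weight update of the form `w -= lr * grad` never moves the weight, which is the obstacle being described here.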


handoflixue 134 days ago

> For example, you can reliably train an LLM to produce accurate output of assembly code that can fit into a context window. However, let's say you give it a terabyte of assembly code - it won't be able to produce correct output as it will run out of context.

Fascinating reasoning. Should we conclude that humans are also incapable of intelligence? I don't know any human who can fit a terabyte of assembly into their context window.

seanmcdirmid 134 days ago

Any human who would try to do this is probably a special case. A reasonable person would break it down into sub-problems and create interfaces to glue them back together...a reasonable AI might do that as well.

heeen2 134 days ago

I can tell you from firsthand experience that Claude + Ghidra MCP is very good at understanding firmware, labeling functions, finding buffer overflows, and patching in custom functionality.

dimitri-vs 134 days ago

On the other hand, the average human has a context window of 2.5 petabytes that's streaming inference 24/7 while consuming the energy equivalent of a couple of sandwiches per day. Oh, and can actually remember things.

handoflixue 134 days ago

Citation desperately needed? Last I checked, humans could not hold the entirety of Wikipedia in working memory, and that's a mere 24 GB. Our GPU might handle "2.5 petabytes" but we're not writing all that to disk - in fact, most people have terrible memory of basically everything they see and do. A one-trick visual-processing pony is hardly proof of intelligence.

rune-dev 134 days ago

I think the idea is that we may not store 2.5 petabytes of facts like Wikipedia. But we do store a ton of “data” in the form of innate knowledge, memories, etc.

I don’t think human memory/intelligence maps cleanly to computer terms though.

dan_mctree 135 days ago

>So obviously, error correction with inputs/outputs is not the way we get to intelligence.

This doesn't seem to follow at all, let alone obviously. Humans are able to reason through code without having to become a completely discrete computer, but probably can't reason through any length of assembly code, so why is that requirement necessary, and how have you shown LLMs can't achieve human levels of competence on this kind of task?

ActorNightly 135 days ago

> but probably can't reason through any length of assembly code

Uh what? You can sit there step by step and execute assembly code, writing things down on a piece of paper and get the correct final result. The limits are things like attention span, which is separate from intelligence.

Human brains operate continuously, with multiple parts being active at once, with weight adjustment done in real time both in the style of backpropagation, and real time updates for things like "memory". How do you train an LLM to behave like that?

dagss 134 days ago

So humans can get pen and paper and sleep and rest, but LLMs can't get files and context resets?

Give the LLM a tool that looks up and records instructions from/to files, instead of holding them in the context window, and let it actively manage its context (write a new context and start fresh), and I think you would find the LLM could probably do it about as reliably as a human.

Context is basically "short term memory". Why do you set the bar higher for LLMs than for humans?
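
A minimal sketch of that files-plus-context-resets idea, with no LLM involved. Everything here is an assumption for illustration: the three-instruction toy language (`ADD`, `SUB`, `MUL` on one accumulator, no jumps), the JSON state file, and the names `run_chunk`, `run_program`, and `demo`. The worker only ever holds `chunk_size` instructions at once; what carries over between chunks is the small state it wrote to disk.

```python
import json
import os
import tempfile
from itertools import islice


def run_chunk(program_path, state, chunk_size):
    """Execute up to chunk_size instructions starting at state['pc']."""
    with open(program_path) as f:
        # Only this slice of the program is ever held in memory.
        chunk = list(islice(f, state["pc"], state["pc"] + chunk_size))
    for line in chunk:
        op, arg = line.split()
        n = int(arg)
        if op == "ADD":
            state["acc"] += n
        elif op == "SUB":
            state["acc"] -= n
        elif op == "MUL":
            state["acc"] *= n
        else:
            raise ValueError(f"unknown instruction: {op}")
        state["pc"] += 1
    return bool(chunk)  # False once the program is exhausted


def run_program(program_path, state_path, chunk_size=3):
    """Run the whole program, persisting state to a file between chunks."""
    with open(state_path, "w") as f:
        json.dump({"pc": 0, "acc": 0}, f)
    while True:
        # "Context reset": start each round from nothing but the saved notes.
        with open(state_path) as f:
            state = json.load(f)
        if not run_chunk(program_path, state, chunk_size):
            return state["acc"]
        with open(state_path, "w") as f:
            json.dump(state, f)


def demo(chunk_size=3):
    """Run a seven-instruction toy program from temporary files."""
    program = ["ADD 5", "MUL 3", "SUB 1", "ADD 10", "MUL 2", "SUB 4", "ADD 7"]
    with tempfile.TemporaryDirectory() as d:
        program_path = os.path.join(d, "program.txt")
        state_path = os.path.join(d, "state.json")
        with open(program_path, "w") as f:
            f.write("\n".join(program) + "\n")
        return run_program(program_path, state_path, chunk_size)


if __name__ == "__main__":
    print(demo())  # 51, regardless of chunk_size
```

The result does not depend on `chunk_size`, which is the point: the length of the program is bounded by disk, not by what fits in working memory at once. Whether an LLM can play the role of `run_chunk` reliably is the open question, and this sketch does not settle it.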

foxglacier 134 days ago

Couldn't you periodically re-train it on what it's already done and use the context window for more short-term memory? That's kind of what humans do - we can't learn a huge amount in a short time but can accumulate a lot slowly (school, experience).

A major obstacle is that they don't learn from their users, probably because of privacy. But imagine if your context window was shared with other people, and/or all your conversations were used to train it. It would get to know individuals and perhaps treat them differently, or maybe even manipulate how they interact with each other so it becomes like a giant Jeffrey Epstein.

wavemode 135 days ago

You're putting a bunch of words in the parent commenter's mouth, and arguing against a strawman.

In this context, "here’s how LLMs actually work" is what allows someone to have an informed opinion on whether a singularity is coming or not. If you don't understand how they work, then any company trying to sell their AI, or any random person on the Internet, can easily convince you that a singularity is coming without any evidence.

This is separate from directly answering the question "is a singularity coming?"

handoflixue 134 days ago

The problem is, there are two groups:

One says "well, it was built as a bunch of pieces, so it can only do the thing the pieces can do", which is reasonably dismissed by noting that basically the only people predicting current LLM capabilities are the ones who are remarkably worried about a singularity occurring.

The other says "we can evaluate capabilities and notice that LLMs keep gaining new features at an exponential, now bordering on hyperbolic, rate", like the OP link. And those people are also fairly worried about the singularity occurring.

So mainly you get people using "here's how LLMs actually work" to argue against the Singularity if-and-only-if they are also the ones arguing that LLMs can't do the things that they can provably do, today, or are otherwise making arguments that also declare humans aren't capable of intelligence / reasoning / etc.

wavemode 134 days ago

False dichotomy. One can believe that LLMs are capable of more than their constituent parts without necessarily believing that their real-world utility is growing at a hyperbolic rate.

handoflixue 134 days ago

Fair - I meant there are two major clusters in the mainstream debate, but like all debates there are obviously a few people off in all sorts of other positions.

esailija 134 days ago

There is more than molecules, neurons and synapses. They are made from lower level stuff that we have no idea about (well, we do in this instance but you get the point). They are just higher level things that are useful to explain and understand some things but don't describe or capture the whole thing. For that you would need to go to lower and lower level and so far it seems they go on infinitely. Currently we are stuck at the quantum level, that doesn't mean it's the final level.

OTOH, an LLM is just a token prediction engine. That description fully and completely covers it. There are no lower-level secrets hidden in the design that nobody understands, because it could not have been created if there were. The fact that the output can be surprising is not evidence of anything; we have always had surprising outputs like funny bugs or unexpected features. Using the word "emergence" for this is just deceitful.

This algorithm has fundamental limitations and they have not been getting better, if you look closely. For instance, you could vibe code a C compiler now, but it's 80% there: a cute trick, but not usable in the real world. Just like anything, it cannot be economically vibe coded to 100%. They are not going back and vibe coding the previous simpler projects to 100% with "improved" models. Instead they are just vibe coding something bigger to 80%. This is not an improvement in limitations; it is actually communicating between the lines that the limitations cannot be overcome.

Also, enshittification has not even started yet.

grogenaut 134 days ago

I can bake a cake while having 0 understanding of the chemistry that powers the transformation. One is a pile of wet flour, the other is delicious.

A dog can create a snack by doing a trick. Doesn't mean that there isn't some mechanism going on there that neither of them understands.

beepbooptheory 134 days ago

Whose argument is this supposed to be furthering here? You didn't specify what the wet flour is? What is the point of this contribution?