| HN Mirror



	by westoque 406 days ago
	Must be my ignorance, but every time I see explainers for LLMs like the one in the post, it’s hard to believe that AGI is upon us. It just doesn’t feel that “intelligent”, but again, that might just be my ignorance.

4 comments

imtringued 405 days ago

It's never going to be AGI, because we're still stuck in the static weights era.

Just because it is theoretically possible to scale your way through sheer brute force alone using a trillion times the compute doesn't mean that you can't come up with a better compute scaling architecture that uses less energy.

It's the same as having a turing machine with one tape vs multiple tapes. In theory it changes nothing, in practice having even the simplest algorithms be quadratic is a huge drag.

The problem with previous AI approaches is that humans wanted to make use of their domain expertise and ended up anthropomorphizing the ML models, which resulted in them being overtaken by people who invested little in domain expertise and more into compute scaling. The quintessential bitter lesson. With the advent of the bitter lesson, people who don't understand anything at all except the concept "bigger is better" arrived, and they think that they can wring out blood from a stone. The problem they run into is that they are trying to get something out of compute scaling that you can't get out of compute scaling.

What they want to do is satisfy a problem definition using an architecture that is designed to solve a completely different problem definition. The AGI compute scaling crowd wants something that is capable of responding and learning through experience, out of something that is inherently designed and punished to not learn through experience. The key aspect "continual learning" does not rely on domain knowledge. It is a compute scaling paradigm, but it's not the same compute scaling paradigm that static weights represent. You can't bet on donkeys in a horse race and expect to win, but since everyone is bringing donkeys to the race it sure looks like you can.

My personal bet is that we will use self referential matrices and other meta learning strategies. The days of hand tuning learning rates to produce pre-baked weights should be over by the end of the decade.

hliyan 406 days ago

Because LLMs successfully emulate a subset of our brain's functions: memory and imagination (the generative/mixing function). What's missing is our brain's ability to validate the generative output against a model of the environment described by memory and output (the real world), which is built on sensory input. In short, we have a concept of true/false, LLMs don't.

nurettin 406 days ago

LLMs emulate language by following intricate links between tokens. This is not meant to emulate memory or imagination, just transforming a list of tokens into another list of tokens, generating language. And language is a huge part of the intelligence puzzle so it looks smart to people despite being quite mechanical.

A next step could be to create a mind, with a piece that works similarly to the parietal lobe to give it a sense of self or temporal existence.

dTal 405 days ago

> it looks smart to people despite being quite mechanical

Note that brains themselves are also "quite mechanical", as is any physical system or piece of software. "Looks smart", in the limit, reduces to "is smart".

nurettin 398 days ago

Brains themselves have a lot more mechanisms to cause emergent behavior, what with all the adaptive organic layers, so I can't really compare the two 1-1.

throwawaymaths 406 days ago

eh, transformers are universal differentiable layered hash tables. that's incredibly powerful. most logic is just pulling symbols and matching structures with "hash"es.

if intelligence is just reasonable manipulation of logic, it's unsurprising that an LLM could be intelligent. what is maybe surprising is that we have ~intelligence without going up a few more orders of magnitude in size. what's possibly more surprising is that training it on the internet got it doing the things it's doing
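
One way to read the "differentiable hash table" comparison is as a soft key-value lookup: instead of returning the value for one exact key, attention returns a similarity-weighted blend of all values. The sketch below is a single toy attention step in plain Python, not a transformer; `soft_lookup` and its `sharpness` parameter are hypothetical names chosen for illustration.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_lookup(query, keys, values, sharpness=1.0):
    # Hypothetical helper: score each key by dot product with the query,
    # turn the scores into weights, and return the weighted mix of values.
    scores = [sharpness * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]

# Low sharpness blends both entries; high sharpness approaches an exact lookup.
blended = soft_lookup([1.0, 0.0], keys, values, sharpness=1.0)
nearly_exact = soft_lookup([1.0, 0.0], keys, values, sharpness=50.0)
```

Because every step is a smooth function of the keys and values, the "table" can be adjusted by gradient descent, which is the part an ordinary hash table lacks.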

jlawson 406 days ago

Neurons are pretty simple too.

Any arbitrarily complex system must be made of simpler components, recursively down to arbitrary levels of simplicity. If you zoom in enough everything is dumb.

Zorass 406 days ago

The deeper you break things down, the dumber they seem. But maybe that dumbness is just an illusion of the observer's perspective.

Consciousness isn’t in the neurons themselves—it's in the invisible coordination and tension between them.

guappa 405 days ago

Anything is simple if you approximate it to a dimensionless point and ignore all the complexities that make it different from that.

jlawson 405 days ago

No, you misunderstood. I am describing [taking part of the whole] not [simplifying the whole] - is that clearer?

voidspark 406 days ago

Neurons are surprisingly not simple. Vastly more complex than the ultra simplified model in artificial neural networks.

jiggawatts 406 days ago

Most of the complexity is incidental to intelligence. It's mostly just the machinery of keeping the cell alive.

Most everything in biology is a clumsy hack accidentally discovered via evolution, and then optimised to death over aeons.

We can sidestep all that mess and extract just the core algorithm that is actually required for intelligence.

voidspark 406 days ago

That is your assumption and it is wrong.

https://grok.com/share/bGVnYWN5_ab498084-58c4-4345-9140-07b5...

Biological Neuron: Processes information through complex, nonlinear integration of thousands of excitatory and inhibitory inputs across dendritic trees, producing spiking outputs with rich temporal patterns. It adapts dynamically via synaptic plasticity, neuromodulation, and structural changes, operating in a probabilistic, energy-efficient manner within oscillatory networks.

Artificial Neuron: Performs simple, linear summation of weighted inputs, applies a static activation function, and produces a single scalar output. It lacks temporal dynamics, local plasticity, or neuromodulation, operating deterministically with high computational cost and fixed connectivity.
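
For reference, the artificial neuron described above fits in a few lines. This is a minimal sketch assuming a sigmoid activation; the weights and bias are made-up values, not taken from any model.

```python
import math

def artificial_neuron(inputs, weights, bias):
    # Linear part: weighted sum of the inputs plus a bias term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Static activation: the same fixed sigmoid is applied every time.
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values only: 2.0*1.0 + (-1.0)*0.0 - 2.0 = 0, so the output is 0.5.
output = artificial_neuron([1.0, 0.0], [2.0, -1.0], -2.0)
```

Everything that adapts here lives in `weights` and `bias`, which are fixed once training ends; there is no timing, spiking, or per-connection chemistry.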

voidspark 406 days ago

This is interesting

https://chatgpt.com/share/68219da9-1e78-8007-b083-8a81bfbea2...

"Dendrites can implement non‑linear sub‑units and even logic‑gate‑like behavior before the soma integrates them, whereas the standard artificial neuron uses a plain weighted sum."

"Neurotransmitter diversity (e.g., glutamate, GABA, dopamine) allows different semantics on each connection. An artificial edge conveys only a signed scalar."

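A toy illustration of the dendrite point, under heavy assumptions: if each branch applies its own nonlinearity before the soma sums the results, one unit can compute XOR, which a single weighted sum followed by a threshold cannot. The branch wiring below is invented for the example and is not a biophysical model.

```python
def relu(x):
    return max(0.0, x)

def dendritic_neuron(x1, x2):
    # Each hypothetical branch is a small nonlinear sub-unit of its own.
    branch_a = relu(x1 - x2)  # active only when x1 is on and x2 is off
    branch_b = relu(x2 - x1)  # active only when x2 is on and x1 is off
    # The soma sums the branch outputs and applies a threshold.
    return 1 if branch_a + branch_b > 0.5 else 0

xor_table = {(a, b): dendritic_neuron(a, b) for a in (0, 1) for b in (0, 1)}
```

In artificial-network terms this is a small two-layer network, so the comparison amounts to saying that one biological neuron may behave more like a small network than like a single unit.
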
laichzeit0 406 days ago

Neither are most functions, but locally, at a point, a linear approximation works just fine in practice.
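
The local-linearity claim can be checked numerically. The sketch below builds a first-order (tangent line) approximation of `tanh` around 0 and compares the error close to that point and far from it; `linearize` is a hypothetical helper that estimates the slope with a central difference.

```python
import math

def linearize(f, x0, h=1e-6):
    # Estimate the slope at x0 with a central difference,
    # then return the tangent-line approximation around x0.
    slope = (f(x0 + h) - f(x0 - h)) / (2 * h)
    intercept = f(x0)
    return lambda x: intercept + slope * (x - x0)

tanh_near_zero = linearize(math.tanh, 0.0)

# Close to x0 the line tracks the function; far away it does not.
near_error = abs(math.tanh(0.1) - tanh_near_zero(0.1))
far_error = abs(math.tanh(2.0) - tanh_near_zero(2.0))
```

The approximation is only good near the chosen point, which is where the disagreement in this subthread sits: whether "locally linear" is enough to capture what a neuron does.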

voidspark 406 days ago

https://news.ycombinator.com/item?id=43959553