by TrainedMonkey 737 days ago
I read this as "we have not built up tools / math to understand neural networks as they are new and exciting" and not as "neural networks are magical and complex and not understandable because we are meddling with something we cannot control".

A good example would be planes: it took a long while to develop mathematical models that could describe their behavior. Meanwhile, practical experimentation produced decent rules of thumb for what worked and what did not.

So I don't think it's fair to say that "we don't" (know how neural networks work); rather, we don't yet have the math / models that can explain or model their behavior...

6 comments

Chaotic nonlinear dynamics have been an object of mathematical research for a very long time and we have built up good mathematical tools to work with them, but in spite of that, turbulent flow and similar phenomena (brains/LLMs) remain poorly understood.

The problem is that the macro and micro dynamics of complex systems are intimately linked, making for non-stationary non-ergodic behavior that cannot be reduced to a few principles upon which we can build a model or extrapolate a body of knowledge. We simply cannot understand complex systems because they cannot be "reduced". They are what they are, unique and unprincipled in every moment (hey, like people!).

Physicists would probably argue that the system might be understood but that we don’t have the model for it yet.

Many natural phenomena look chaotic at best without a model. Once you have a model things fall into place and everything starts looking orderly.

Maybe it cannot be reduced. But maybe we are just observing the peripherals without understanding the inner workings.

If I can speak in aphorisms,

Creation is downhill, analysis is uphill.

Profound ideas often seem simple once understood.

Well put.

In other words: simplicity is a hallmark of understanding.

"We don't know how X works" literally means "we don't have models yet that can explain X's behavior".

TFA is about making a tiny bit of progress towards such models. Perhaps you should read it.

The analogy to airplanes is not relevant imo. Our lack of understanding behind the physics of an airplane is different from our lack of understanding of what an LLM is doing.

The lack of understanding is so profound for LLMs that we can’t even fully define the thing we don’t understand. What is intelligence? What is understanding?

Understanding the LLM would be akin to understanding the human brain. Which presents a secondary problem. Is it possible for an entity to understand itself holistically in the same way we understand physical processes with mathematical models? Unlikely imo.

I think this project is a pipe dream. At best it will yield another analogy. This is what I mean: We currently understand machine learning through the analogy of a best fit curve. This project will at best just come up with another high level perspective that offers limited understanding.

In fact, I predict that all AI technology into the far future can only be understood through heavy use of extremely high level abstractions. It’s simply not possible for a thing to truly understand itself.

I think you have to make a distinction between transformers and neural networks in general, maybe also between training and inference.

Many/most types of neural network such as CNNs are well understood since there is a simple flow of information. e.g. In a CNN you've got a hierarchy of feature detectors (convolutional layers) with a few linear classifier layers on top. Feature detectors are just learning decision surfaces to isolate features (useful to higher layers), and at inference time the CNN is just detecting these hierarchical features, then classifying the image based on combinations of these features. Simple.
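To make the "feature detectors plus a linear classifier on top" picture concrete, here is a minimal sketch in plain Python. It is a toy, not a real CNN: the kernels and classifier weights are hand-picked assumptions rather than learned, there is a single convolutional layer instead of a hierarchy, and the names (conv2d_relu, classify, VERTICAL_EDGE, HORIZONTAL_EDGE) are made up for illustration.

```python
# Toy illustration only: all weights below are hand-picked, not learned.

def conv2d_relu(image, kernel):
    """Slide the kernel over the image (valid positions only), then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))  # ReLU keeps only positive responses
        out.append(row)
    return out

def global_max(feature_map):
    """Collapse a feature map to 'how strongly was this feature seen anywhere?'"""
    return max(max(row) for row in feature_map)

# Two hypothetical feature detectors (assumed, not taken from any trained model).
VERTICAL_EDGE = [[-1, 1], [-1, 1]]
HORIZONTAL_EDGE = [[-1, -1], [1, 1]]

# Hypothetical linear classifier: one weight vector per class.
CLASS_WEIGHTS = {"vertical": [1.0, -1.0], "horizontal": [-1.0, 1.0]}

def classify(image):
    # Step 1: detect features.
    features = [global_max(conv2d_relu(image, VERTICAL_EDGE)),
                global_max(conv2d_relu(image, HORIZONTAL_EDGE))]
    # Step 2: classify from a linear combination of the detected features.
    scores = {label: sum(w * f for w, f in zip(ws, features))
              for label, ws in CLASS_WEIGHTS.items()}
    return max(scores, key=scores.get)

vertical_img = [[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]]
horizontal_img = [[0, 0, 0, 0],
                  [0, 0, 0, 0],
                  [1, 1, 1, 1],
                  [1, 1, 1, 1]]

print(classify(vertical_img))
print(classify(horizontal_img))
```

The point is only that the flow of information is fixed by the architecture: detect features, then combine them linearly. A trained CNN differs in that the kernels and weights come from training and the layers are stacked.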

Transformers seem qualitatively different in terms of complexity of operation, not least because it seems we still don't even know exactly what they are learning. Sure, they are learning to predict the next word, but just like the CNN whose output classification is based on features learnt by earlier layers, the output words predicted by a transformer are based on some sort of world model/derived rules learned by earlier layers of the transformer, which we don't fully understand.

Not only don't we know exactly what transformers are learning internally (although recent interpretability work gives us a glimpse of some of the sorts of things they are learning), but also the way data moves through them is partially learnt rather than prescribed by the architecture. We have attention heads utilizing learnt lookup keys to find data at arbitrary positions in the context, and then able to copy portions of that data to other positions. Attention heads learn to coordinate to work in unison in ways not specified by the architecture, such as the "induction heads" (consecutive attention head pairs) identified by Anthropic that seem to be one of the workhorses of how transformers are working and copying data around.
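The "lookup keys" idea can be sketched as a single attention head using standard scaled dot-product attention. This is only a rough illustration under assumptions: in a real transformer the queries, keys and values come from learned projections of the token embeddings, whereas here they are hand-picked, causal masking is omitted, and this is not a reproduction of the induction-head circuit mentioned above. The function names (softmax, attend) are made up for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """One attention head for one query position: softmax(q.k / sqrt(d)) weighted sum of values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return weights, output

# Hand-picked (assumed) keys: each context position advertises a distinct key.
keys = [[10.0, 0.0, 0.0],
        [0.0, 10.0, 0.0],
        [0.0, 0.0, 10.0]]
# The data stored at each position.
values = [[1.0, 0.0],
          [0.0, 1.0],
          [5.0, 5.0]]
# A query that matches the key at position 2, wherever that position happens to be.
query = [0.0, 0.0, 10.0]

weights, output = attend(query, keys, values)
print(weights)  # almost all weight lands on position 2
print(output)   # approximately the value stored at position 2
```

Which position gets read depends on the data (the query and keys), not on a fixed wiring. That is the sense in which the data flow is learnt rather than set by the architecture.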

Additionally, there are multiple types of data learnt by a transformer, from declarative knowledge ("facts") that seem to mostly be learnt by the linear layers to the language/thought rules learnt by the attention mechanism that then affect the flow of data through the model, as discussed above.

So, it's not that we don't know how neural networks work (and of course at one level they all work the same - to minimize errors), but more specifically that we don't fully know how transformer-based LLMs work since their operation is a lot more dynamic and data dependent than most other architectures, and the complexity of what they are learning far higher.

"neural networks as they are new"

Yup, ANNs have only been around since the 1950s... Brand spanking new

> we don't have math / models yet that can explain/model their behavior...

So, what you're saying is we don't know how they work yet? It's not that deep.