by eigenket 840 days ago
It was formulated at least twice in different versions. First by Heisenberg with his matrix mechanics and shortly afterwards by Schrödinger in terms of wave-functions.

There was considerable disagreement between the factions of physicists who favoured the different versions, which essentially ended when, after considerable theoretical effort (mostly by Dirac), it was shown that the two pictures are exactly equivalent.

Physicists still use whichever formulation is most suitable for whatever problem they're trying to solve. For example, if you're analysing something where you care about a bunch of bound states, like the simple harmonic oscillator or the hydrogen atom, then the matrix picture tends to be easier to work with.
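
To make the bound-state point concrete, here is a minimal sketch of the harmonic oscillator in the matrix picture. It assumes units where ħ = ω = 1 and truncates the number basis to a handful of states; the helper names (`lowering_operator`, `oscillator_hamiltonian`) are made up for illustration.

```python
import math

def lowering_operator(dim):
    """Truncated matrix of the lowering operator a in the number basis."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # <n-1| a |n> = sqrt(n)
    return a

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def oscillator_hamiltonian(dim):
    """H = a^dagger a + 1/2, assuming units with hbar = omega = 1."""
    a = lowering_operator(dim)
    number_op = matmul(transpose(a), a)  # a is real, so dagger = transpose
    return [[number_op[i][j] + (0.5 if i == j else 0.0) for j in range(dim)]
            for i in range(dim)]

H = oscillator_hamiltonian(5)
energies = [H[n][n] for n in range(5)]  # H comes out diagonal in this basis
```

The Hamiltonian is already diagonal in this basis, so the energy levels n + 1/2 can be read straight off the diagonal without solving any differential equation.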

You are right that wave mechanics was more popular than matrix mechanics because physicists were already very familiar with wave methods.


I wonder if there is a wave formulation for LLMs and transformers in general?
This paper [1] models some simple (r)NNs as ODEs, and uses ODE tools for both training and inference. It’s a start.

[1] https://arxiv.org/abs/1806.07366
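
A rough sketch of the idea behind that kind of model: a stack of residual updates h <- h + dt * f(h, t) is a forward-Euler discretization of dh/dt = f(h, t). The `dynamics` function below is a toy stand-in I made up (plain exponential decay), not a learned layer and not the paper's code, and the fixed-step loop is only an assumption for illustration.

```python
import math

def dynamics(h, t):
    """Toy stand-in for a learned layer f(h, t); here simply dh/dt = -h."""
    return -h

def residual_stack(h0, num_layers, t_end=1.0):
    """Apply num_layers residual updates h <- h + dt * f(h, t) (forward Euler)."""
    dt = t_end / num_layers
    h, t = h0, 0.0
    for _ in range(num_layers):
        h = h + dt * dynamics(h, t)
        t += dt
    return h

exact = math.exp(-1.0)  # exact solution of dh/dt = -h at t = 1 with h(0) = 1
coarse_error = abs(residual_stack(1.0, 10) - exact)
fine_error = abs(residual_stack(1.0, 1000) - exact)
```

With more (and smaller) residual steps, the output approaches the exact ODE solution, which is the sense in which a deep residual stack looks like a continuous-time system.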

I don't know if this is exactly what you are thinking about, but there are some physicists working to understand what happens in transformers: https://proceedings.neurips.cc/paper_files/paper/2023/file/b...
Is it really true that we don't really understand why transformers work so well?

I mean we obviously understand how they work at a purely mechanical level, and we have this analogy with lookup (keys, queries, values) and "attention," but do we really get it? Can someone explain to me why that design works so much better than lots of other things like RNNs?
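
For reference, the mechanical level is small enough to sketch. This is a minimal single-query version of scaled dot-product attention in plain Python, assuming one head and no learned projections; the toy keys, values and query are made up for illustration. It shows the "soft lookup" analogy and says nothing about why the design works so well.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[j] for w, v in zip(weights, values))
              for j in range(len(values[0]))]
    return output, weights

# Toy data: the query points along the first key, so the first value dominates.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
output, weights = attention([5.0, 0.0], keys, values)
```

Unlike a hard dictionary lookup, every value contributes a little, with the weights decided by query-key similarity.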

Or did we just tinker a lot (a method known as "graduate student descent") guided by mathematical hunches and loose analogies with biological brains until we found something that kinda worked?

It wouldn't be the first time. AFAIK we got the idea of wings from birds and figured out how to fly with them before we had a really solid fluid-mechanical understanding of how and why wings work the way they do. We just thought "hmm, so birds fly, so let's try stuff that looks a bit like that..."

We really don't have a mathematical theory for large complexity. We are kinda in the alchemy stage for this "science".
You can probably write down a differential equation which models them but I doubt such a thing would be particularly interesting.
Perhaps neat to visualize.