Hacker News
by orangecat 851 days ago
> And then you can look at 5.8 billion lines of spaghetti code

LLMs don't have anywhere near that much code. The algorithms for training and inference are not that complicated; the "intelligent" behavior is entirely due to the weights.

3 comments

OP clearly means that the weights are the spaghetti code. Technically they may be data, but if they encode all of the actual functionality of the system, then they are effectively bytecode interpreted by a runtime. You can understand how the runtime works if you care to learn, but you will never understand what's happening below that, nor will anyone else.

Aside from annoying people who want to understand how things work, it also means you can never know whether you have a fully optimal or correct solution. All you can do is keep throwing money into the training furnace and hope a better solution falls out next time. The whole nature of it gatekeeps out anyone who doesn't have enormous amounts of money to burn.

I can see that, although to me there's a difference between weights and something like bytecode. The weights don't encode any sort of logical operations; they're just numbers that get multiplied and added according to relatively simple algorithms.

Totally agreed that the process of generating and evaluating weights is opaque and not very accessible.

You can simulate any digital circuit by multiplying and adding numbers.
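To make that concrete, here is a minimal sketch, assuming bits are represented as the integers 0 and 1. NAND can be written as 1 - a*b, and since NAND is a universal gate, any combinational circuit can be composed from it. The function names are illustrative only.

```python
def nand(a: int, b: int) -> int:
    # For a, b in {0, 1}: a * b acts as AND, and 1 - x acts as NOT.
    return 1 - a * b

def xor(a: int, b: int) -> int:
    # Standard four-NAND construction of XOR.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def and_(a: int, b: int) -> int:
    # AND is NAND followed by NOT (a NAND gate with both inputs tied together).
    n = nand(a, b)
    return nand(n, n)

def half_adder(a: int, b: int) -> tuple[int, int]:
    # Returns (sum bit, carry bit), built only from multiply-and-add gates.
    return xor(a, b), and_(a, b)
```

Calling half_adder on each of the four input pairs gives the usual truth table, even though the only primitive operations underneath are one multiplication and one subtraction.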
But that's exactly the point. The code you are talking about is more like an interpreter for a virtual machine, which then runs a program made up of billions of numbers that wasn't designed by a human (or any sort of intelligence - you can argue about the end product, but the training process certainly isn't intelligent).
When doing inference, the weights are what's analogous to the 5.8 billion lines of spaghetti code.
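A toy sketch of the interpreter analogy, under heavy assumptions: a single threshold unit stands in for the "runtime", and the weights below are hand-picked for illustration, not trained. The names run, AND_PROGRAM and OR_PROGRAM are hypothetical. The point is only that the code stays fixed while the numbers decide what gets computed.

```python
def run(program, inputs):
    # The fixed "runtime": multiply, add, threshold. It never changes.
    weights, bias = program
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Two "programs" that differ only in their numbers (hand-picked, not trained).
AND_PROGRAM = ([1.0, 1.0], -1.5)
OR_PROGRAM = ([1.0, 1.0], -0.5)

PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [run(AND_PROGRAM, p) for p in PAIRS]
or_outputs = [run(OR_PROGRAM, p) for p in PAIRS]
```

Reading run tells you how the arithmetic works but nothing about whether a given program computes AND or OR; that is only visible in the numbers. With two weights and a bias you can still read the behavior off by inspection, which is exactly what stops being possible at billions of weights.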