by mvkel
766 days ago
Just because you can make something doesn't mean you know why it works. There are thousands of people around the world trying to reverse engineer what is going on in the billions or trillions of parameters in an LLM. It's a field called "Mechanistic Interpretability." The people who do the work jokingly call it "cursed" because it is so difficult and they have made so little progress so far. Literally nobody can predict, before a new model is released, what capabilities it will have. And then, months after a model is released, people discover new abilities in it, such as decent chess playing. They are black boxes.
|
That's also an artefact of how evals have been scored on a pass/fail basis, so that an LLM that gets 90% of a question right is just as much a failure as one that gets 0% of it right. Skills then appear to emerge suddenly and surprisingly only because of the flawed way that we are forced to study them. Consider the training regime, and partial success towards a goal, and emergence is far less prevalent. There was a paper on that recently; I'll see if I can find it.
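
To make the pass/fail point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the answer length, the accuracy values, and the simplification that each token of an answer is right independently with the same probability. It is not taken from any real eval. It just shows how the same smooth improvement looks under partial credit versus all-or-nothing scoring.

```python
# Toy model (assumption): each token of an answer is correct independently
# with probability `per_token_accuracy`. Real models are messier than this.

def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Pass/fail score: the answer only counts if every token is right."""
    return per_token_accuracy ** answer_length


def partial_credit(per_token_accuracy: float) -> float:
    """Partial-credit score: expected fraction of the answer that is right."""
    return per_token_accuracy


ANSWER_LENGTH = 10  # hypothetical answer length in tokens
ACCURACIES = [0.5, 0.6, 0.7, 0.8, 0.9, 0.99]  # hypothetical, e.g. growing with scale

print("per-token  partial  pass/fail")
for p in ACCURACIES:
    print(f"{p:9.2f}  {partial_credit(p):7.2f}  {exact_match_rate(p, ANSWER_LENGTH):9.3f}")
```

Under these assumptions the partial-credit column rises steadily, while the pass/fail column sits near zero for most of the range and then shoots up at the end, which is the kind of curve that gets read as a skill "emerging".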