| Your whole argument falls apart at > We don't understand how they work, because we didn't build them. They built themselves. We do understand how they work, we did build them.
The mathematical foundations of these models are sound. The statistics behind them are well understood. What we don’t exactly know is which parameters correspond to which results, since that differs across models. We work backwards to see which parts of the network seem to relate to which outcomes. > When they get better at drawing, it isn't because we taught them to draw. When they get better at reasoning, it isn't because the engineers were better philosophers. Isn’t this the exact opposite of reality? They get better at drawing because we improve their datasets, topologies, and training methods, and in doing so teach them to draw. They get better at reasoning because the engineers and data scientists building training sets do get better at philosophy. They study what reasoning is and apply what they learn to the datasets and training methods. That’s how CoT came about early on. |
We don't understand how they work in the sense that we can't extract the algorithms they're using to accomplish the interesting/valuable "intellectual" labor they're doing. That is, we cannot take GPT-4 and write human-legible code that faithfully represents the "heavy lifting" GPT-4 does when it writes code (or pick any other task you might ask it to do).
That inability makes it difficult to reliably predict when they'll fail, how to improve them in specific ways, etc.
The only way in which we "understand" them is that we understand the training process which created them (and even that's limited to reproducible open-source models), which is about as accurate as saying that we "understand" human cognition because we know about evolution. In reality, we understand very little about human cognition, certainly not enough to reliably reproduce it in silico or intervene on it without a bunch of very expensive (and failure-prone) trial-and-error.
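To make the gap between "knowing the weights" and "knowing the algorithm" concrete, here is a toy sketch. Everything in it is an assumption for illustration: the two networks (`NET_A`, `NET_B`) and their weights are hand-picked, not taken from any real model. Both compute XOR exactly, yet their parameters look nothing alike, and knocking out "the same" hidden unit breaks different inputs in each. Working backwards from ablations tells you what a unit matters for in one model, not what any model's unit 1 does in general.

```python
# Toy illustration only: hand-picked weights, not taken from any real model.

def relu(v):
    return v if v > 0 else 0.0

def forward(net, x1, x2, ablate=None):
    """Run a 2-input, 2-hidden-unit, 1-output ReLU network.

    `ablate` is the index of a hidden unit to zero out, or None.
    """
    hidden = []
    for i, (w1, w2, b) in enumerate(net["hidden"]):
        h = relu(w1 * x1 + w2 * x2 + b)
        hidden.append(0.0 if i == ablate else h)
    v1, v2 = net["out"]
    return v1 * hidden[0] + v2 * hidden[1]

# Two different parameter settings that both implement XOR.
NET_A = {"hidden": [(1, 1, 0), (1, 1, -1)], "out": (1, -2)}
NET_B = {"hidden": [(1, -1, 0), (-1, 1, 0)], "out": (1, 1)}

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def table(net, ablate=None):
    """Outputs of `net` on all four binary inputs, in INPUTS order."""
    return [forward(net, a, b, ablate) for a, b in INPUTS]

if __name__ == "__main__":
    for name, net in (("A", NET_A), ("B", NET_B)):
        print(name, "intact:", table(net))
        for unit in (0, 1):
            print(name, "unit", unit, "ablated:", table(net, unit))
```

With four weights per hidden layer you can still read the trick off by hand. The claim above is that nothing like that reading exists yet for the billions of parameters in something like GPT-4.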