by rdw
332 days ago
The bitter lesson is becoming misunderstood as the world moves on. Unstated yet core to it is that AI researchers were historically trying to build an understanding of human intelligence. They intended to assemble a human brain piece by piece, and thus be able to explain (and fix) our own biological ones, much as can be done with physical simulations of knee joints. Of course, you can also use that knowledge to create useful thinking machines, because you understand it well enough to control it, much as we have many robotic joints. So the bitter lesson is based on a disappointment that you're building intelligence without understanding why it works.
Just because a great many physical phenomena can be explained by a couple of foundational principles does not mean that understanding those core principles automatically endows one with an understanding of how and why materials refract light, or of a plethora of other specific effects... effects worth understanding individually, even if they are still explained in terms of those foundational concepts.
Knowing a complicated set of axioms or postulates enables one to derive theorems from them, but the proofs of those implied theorems are nonetheless non-trivial and have a value of their own (even though they can be expressed as, and expanded into, a DAG of applications of that "bitterly minimal" axiomatization).
Once enough patterns are correctly modeled by machines, and given enough time to analyze them, people will eventually discover a better account of how and why things work (beyond the mere abstract knowledge that latent parameters were fitted against a loss function).
In some sense, deeper understanding has already arrived for simpler models like word2vec, where many papers have analyzed and explained relations between word vectors. This too lagged behind the creation and use of word vector embeddings.
It is not inconceivable that someday someone will observe an analogy between, say, QKV tensors and the triples that result from graph linearization: think subject, object, predicate. (I hate those triples, though: try modeling a ternary relation like 2+5=7 with SOP triples. They're really only meant to capture "sky - is - blue" associations. A better type of triple would be player-role-act triples; with those one can model ternary relations, but one needs to reify the relation.)
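To make the reification point concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the `Triple` type, the role names (`left_addend`, `right_addend`, `sum`) and the act identifier `add#1` are assumptions, not any standard vocabulary. The idea is only that 2+5=7 becomes three player-role-act triples hanging off one reified "act of addition".

```python
from collections import namedtuple

# Hypothetical triple type for illustration; not a standard vocabulary.
Triple = namedtuple("Triple", ["player", "role", "act"])

# A binary association fits comfortably in one subject-predicate-object triple.
sky = ("sky", "is", "blue")

# 2 + 5 = 7 relates three values at once, so the relation itself is reified:
# "add#1" is a made-up identifier naming this one act of addition.
addition = [
    Triple(player=2, role="left_addend", act="add#1"),
    Triple(player=5, role="right_addend", act="add#1"),
    Triple(player=7, role="sum", act="add#1"),
]


def roles_of(triples, act):
    """Collect a role -> player mapping for one reified act."""
    return {t.role: t.player for t in triples if t.act == act}


def holds_as_addition(triples, act):
    """Check that the players attached to an act satisfy left + right == sum."""
    r = roles_of(triples, act)
    return r["left_addend"] + r["right_addend"] == r["sum"]
```

Calling `roles_of(addition, "add#1")` gathers the three participants back into one record, which is exactly the extra step that plain "sky - is - blue" triples never need.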
Similarly, even without mathematical training or a formal system, humans display awareness of the concepts of sets, membership, existence, ... The chatbots display this awareness too. It's all vague, naive set theory. But how are DNNs modeling set theory? That's a paper someday.