by rdw 332 days ago
The bitter lesson is becoming misunderstood as the world moves on. Unstated yet core to it is that AI researchers were historically attempting to build an understanding of human intelligence. They intended to assemble a human brain piece by piece, and thus be able to explain (and fix) our own biological ones, much as can be done with physical simulations of knee joints. Of course, you can also use that knowledge to create useful thinking machines, because you understand it well enough to be able to control it. Much like how we have many robotic joints.

So, the bitter lesson is based on a disappointment that you're building intelligence without understanding why it works.

2 comments

Right, like discovering Huygens' principle, or interference, or integrals/sums over all paths in physics.

Just because a whole lot of physical phenomena can be explained by a couple of foundational principles does not mean that understanding those core patterns automatically endows one with an understanding of how and why materials refract light, or of a plethora of other specific effects. Those effects are worth understanding individually, even if they are still explained in terms of those foundational concepts.

Knowing a complicated set of axioms or postulates enables one to derive theorems from them, but the proofs of those implied theorems are nonetheless non-trivial, and have a value of their own (even though they can be expressed and expanded into a DAG of applications of that "bitterly minimal" axiomatization).

Once enough patterns are correctly modeled by machines, and given enough time to analyze them, people will eventually discover a better account of how and why things work (beyond the mere abstract knowledge that latent parameters were fitted against a loss function).

In some sense deeper understanding has already come for the simpler models like word2vec, where many papers have analyzed and explained relations between word vectors. This too lagged behind the creation and utilization of word vector embeddings.
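To make "relations between word vectors" concrete, here is a minimal sketch of the classic analogy arithmetic. The vectors below are hand-made toy values on two assumed axes (royalty, gender), not real word2vec output, and the `analogy` helper is hypothetical; it only illustrates the kind of structure those papers analyze.

```python
import math

# Hand-made toy vectors (NOT real word2vec output); assumed axes: (royalty, gender).
vectors = {
    "king":  (1.0,  1.0),
    "queen": (1.0, -1.0),
    "man":   (0.0,  1.0),
    "woman": (0.0, -1.0),
    "apple": (0.1,  0.0),
}

def cosine(a, b):
    """Cosine similarity between two non-zero 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    # Exclude the query words themselves, as is usual for analogy tests.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman")
```

With these toy values the offset king - man + woman lands exactly on the vector assigned to "queen"; real embeddings only approximate this.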

It is not inconceivable that someday someone will observe an analogy between, say, QKV tensors and the triples that result from graph linearization: think subject, object, predicate. (I hate those triples, though: try modeling a ternary relation like 2+5=7 with SOP triples. They're really only meant to capture "sky - is - blue" associations. A better type of triple would be player-role-act triples; one can then model ternary relations, but one needs to reify the relation.)
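As a rough sketch of what reifying the relation could look like, assume each triple is (player, role, act), where the act is an identifier for one instance of the relation. The role names and the `add#1` identifier are made up for illustration.

```python
from collections import defaultdict

# Hypothetical encoding: (player, role, act). "add#1" is a made-up identifier
# for one reified instance of the ternary relation 2 + 5 = 7.
triples = [
    (2, "addend_left", "add#1"),
    (5, "addend_right", "add#1"),
    (7, "sum", "add#1"),
]

def group_by_act(triples):
    """Collect the triples of each act into a role -> player mapping."""
    acts = defaultdict(dict)
    for player, role, act in triples:
        acts[act][role] = player
    return dict(acts)

acts = group_by_act(triples)
fact = acts["add#1"]

# The ternary relation is recovered by joining the three triples on the act.
holds = fact["addend_left"] + fact["addend_right"] == fact["sum"]
```

The extra node is the cost of the approach: no single triple states the fact, and the relation only exists once the three triples sharing an act are joined.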

Similarly, without mathematical training, humans display awareness of the concepts of sets, membership, existence, ... without a formal system. The chatbots display this awareness. It's all vague naive set theory. But how are DNNs modeling set theory? That's a paper someday.

> you're building intelligence without understanding why it works.

But if we do a good enough job of that, it should then be able to explain to us why it works (after it does some research/science on itself). Yes?

A bit fantastical. We are a general intelligence and we don't understand ourselves.

Indeed. But the premise of the objection was that it is understandable, and that it is a shame we're not putting such understanding before implementing these systems.

If you're right, and it's essentially impossible to understand (and we still want to advance these technologies), we will have to do so in some degree of ignorance anyway.

It doesn't have to be impossible to understand for a (hypothetical) AGI to have as much difficulty understanding it as we do.

I must not be communicating very well, because everyone is arguing with me about points I'm not trying to make. Sorry for that.