by grubbypaw 328 days ago
I was not at all a fan of "The Bitter Lesson versus The Garbage Can", but this misses the same thing that it missed.

The Bitter Lesson is from the perspective of how to spend your entire career. It is correct over the course of a very long time, and bakes in Moore's Law.

The Bitter Lesson is true because general methods capture these assumed hardware gains that specific methods may not. It was never meant for contrasting methods at a specific moment in time. At a specific moment in time you're just describing Explore vs Exploit.

3 comments

Right, and if you spot a job that needs doing and can be done by a specialized model, hand-waving about general-purpose, scale-leveraging models eventually overtaking specialized ones has not historically been a winning approach.

Except in the last year or two, which is why people are citing it a lot :)

Probably because this is how bubbles happen.

I think there might be interesting time scales in between “now” and “my entire career” to which the bitter lesson may or may not apply. As an outsider to ML I have questions about the longevity of any given “context engineering” approach in light of the bitter lesson.

The bitter lesson becomes more true over time, because inductive bias becomes less useful over time. Case in point: PCA/hand engineering -> CNN -> ViT.