by famouswaffles
1133 days ago
>it would seem "things that mimic human high level processes" would be a good candidate.

That's the natural intuition, yes. But I believe Sutton's point is that this very intuition keeps proving itself wrong in the long term. The way I see it, the problem with the high level is that we don't actually know shit. If we knew completely what it took to model language or vision in the first place, we wouldn't need deep learning at all. It seems intuitive that baking in some basic grammar rules might speed things along. The problem is that we often end up overfitting the models to those specific rules and constraints, limiting their ability to generalize and to learn the more complex underlying patterns and structures in language, patterns that we don't actually know of. The low-level processes produce the high-level performance, but not vice versa. It's said that a single human neuron is equivalent to a CNN. I wouldn't really call the operations of neurons high level, though.
|
That, of course, requires experimentation. If an idea isn't speeding up scaling (which of course should be done), and it isn't mimicking human cognition (the Bitter Lesson says no), what do you decide to try? I guess I'm missing what other heuristics there are to use here.
Just looking at where NLP is currently going: prompt engineering and its various 'step-by-step' siblings all look to me like they're motivated by high-level human cognition. Shouldn't that go against the Bitter Lesson as well?
"The Bitter Lesson" feels like an article written at a time when the intuitions that went into deep learning had become commonplace, and scaling things up was getting a lot of leverage out of the 'insights' that came before. Once the returns have diminished to the point of saturation, the 'insights' will likely be useful again, until methods to scale catch up, and "The Bitter Lesson 2.0" will be making the rounds.