Hacker News
by jprete 535 days ago
The Bitter Lesson claimed that the best approach was to go with more and more data to make the model more and more generally capable, rather than adding human-comprehensible structure to the model. But a lot of LLM applications seem to add missing domain structure until the LLM does what is wanted.
3 comments

The Bitter Lesson states that you can overcome the weakness of your current model by baking priors in (i.e. specific traits about the problem, as is done here), but you will get better long-term results by having the model learn the priors itself.

That seems to have been the case: compare the tricks people had to use with GPT-3 to how Claude Sonnet 3.6 performs today.

The Bitter Lesson pertains to the long term. Even if it holds, it may take decades to be proven correct in this case. In the short term, imparting some human intuition gets us more useful results faster than waiting around for "enough" computation/data.
Improving model capability with more and more data is what model developers do, over months. Structure and prompting improvements can be done by the end user, today.