by ahartmetz 783 days ago
Well, both can be true if you interpret the "it" as "the secret sauce / competitive advantage". A good architecture is a necessary but not sufficient condition for success, but everybody currently uses more or less the same one, so data makes the difference. Until the next improvement in architecture.
2 comments

Or until we run out of data that actually differentiates the models.
I would argue that the "it" is the architecture. We have had pretty much all the data that these LLMs were trained on for a long time. The game changer was the architecture, not the data. Unless of course you are in the "code is data" camp ;).
Probably the "it" is whatever one model has that other models don't have. When everyone is using the same architecture, then the data makes the difference. If everyone has the same data, then the architecture makes the difference.

It sounds pretty obvious to say that the difference is whatever is different, but isn't that literally what both sides of this argument are saying?

edit: I do think that what the original linked essay is saying is slightly subtler than that: _given_ that everyone is using the same transformer architecture, the exact hyperparameters and fine-tuning matter a lot less than the data set does.