Hacker News
by dmarchand90 920 days ago
I kinda wonder if it's at least partially due to OpenAI winning a kind of hyperparameter lottery. When each experiment costs millions, it might be that (aside from good/unique data) they just have a good set of training hyperparameters, and it's too expensive for a competitor to find equal or better settings.
2 comments

I would be surprised if this is the case. Neural scaling laws are well known and are used by all the big industry players to extrapolate from smaller experiments.
Are they really "laws"? My impression is it's all just a bunch of empirical trends.
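For what it's worth, the kind of extrapolation being discussed can be sketched roughly as below: fit a power law to a few cheap runs, then read off a prediction for a much bigger one. This is only an illustration under stated assumptions. The data points are synthetic (generated from a made-up power law, not taken from any real training run), the names `fit_power_law` and `predict_loss` are hypothetical, and real fits often include extra terms such as an irreducible loss.

```python
import math

def fit_power_law(compute, loss):
    """Least-squares fit of loss ~= a * compute**(-b), done in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept of log(loss) vs log(compute).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

def predict_loss(a, b, compute):
    """Evaluate the fitted trend at a new compute budget."""
    return a * compute ** (-b)

# Synthetic "small runs": compute in FLOPs, loss from an invented power law.
small_compute = [1e17, 1e18, 1e19, 1e20]
small_loss = [5.0 * c ** -0.05 for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)
big_run_loss = predict_loss(a, b, 1e23)  # extrapolate 1000x past the largest run
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e23 FLOPs: {big_run_loss:.3f}")
```

Note that the fit is purely empirical: it says nothing about why the trend holds or whether it keeps holding outside the fitted range, which is the concern raised above.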

We can't really know how these hyperparameters behave at large scale, or how they interact with each other.

Is it really the case that OpenAI has data that Google doesn't?

Sorry for my ignorance: why does each experiment cost millions?
Because training a model costs millions, each experiment with a new kind of model costs millions too.
It’s the cost of the compute hardware required to train a model of that size.
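A rough back-of-the-envelope version of that cost argument, as a sketch only: it uses the common rule of thumb that training takes about 6 FLOPs per parameter per token. Every number below (model size, token count, GPU throughput, utilization, hourly price) is an illustrative assumption, not a figure from this thread or from any vendor, and `training_cost_usd` is a hypothetical helper.

```python
def training_cost_usd(n_params, n_tokens, peak_flops_per_gpu,
                      utilization, usd_per_gpu_hour):
    """Estimate the cost of one training run from the ~6*N*D FLOPs rule of thumb."""
    total_flops = 6 * n_params * n_tokens               # approximate training FLOPs
    effective_flops = peak_flops_per_gpu * utilization  # sustained FLOP/s per GPU
    gpu_hours = total_flops / effective_flops / 3600
    return gpu_hours * usd_per_gpu_hour

# All inputs are assumptions chosen only to show the order of magnitude.
cost = training_cost_usd(
    n_params=175e9,             # assumed model size
    n_tokens=300e9,             # assumed training tokens
    peak_flops_per_gpu=312e12,  # assumed accelerator peak throughput
    utilization=0.4,            # assumed fraction of peak actually achieved
    usd_per_gpu_hour=2.0,       # assumed rental price
)
print(f"Estimated cost of one run: ${cost:,.0f}")
```

With these inputs a single run lands in the low millions of dollars, and a hyperparameter search means paying something like that again for every setting you try at full scale.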