Hacker News
by FanaHOVA 1025 days ago
The fact that the GPU quantity dropdown cannot go over 1,000 drives home the "GPU poor" point from the SemiAnalysis post. Meta alone has 16,000 GPUs. OpenAI's cluster from 2020 had 10,000 GPUs. If you're serious about foundation model development and research, you have to go work at one of these "GPU rich" companies.

Or you can invent better models or discover more efficient ways to train existing ones. You know, do something other than dumb scaling up, like what Hinton (backprop, 1987), LeCun (convnets, 1989), or Vaswani et al. (transformers, 2017) did.
I love this comment. Very HN. You’re absolutely right, everyone should just try to make paradigm shifts in the field.
The key word here is “try”. And we are not talking about “everyone”, just those who complain they don’t have access to $5k/hr GPU clusters.
"Or better, you can do the same thing that three people managed to do in the entire industry in the last 36 years."

I mean, don't get me wrong, I'm all for improvements in AI efficiency, but maybe there isn't that much low-hanging fruit to pick? Tons of papers get published on transformer optimization techniques and barely any of them seem to stick.

> do something other than dumb scaling up

This is exactly what people told OpenAI 8 years ago and look where we are now.

8 years ago, dumb scaling up was exactly what we needed. For 8 years we’ve been riding that train. Don’t you think it’s time to try something new?