by p1esk 1025 days ago
Or you can invent better models or discover more efficient ways to train existing ones. You know - do something other than dumb scaling up - like what Hinton (backprop, 1987), LeCun (convnets, 1989), or Vaswani et al. (transformers, 2017) did.
3 comments

I love this comment. Very HN. You’re absolutely right, everyone should just try to make paradigm shifts in the field.
The key word here is “try”. And we are not talking about “everyone”, just those who complain they don’t have access to $5k/hr GPU clusters.
"Or better, you can do the same thing that three people managed to do in the entire industry in the last 36 years."

I mean, don't get me wrong, I'm all for improvements in AI efficiency, but maybe there isn't that much low-hanging fruit to pick? Tons of papers get published on transformer optimization techniques and barely any of them seem to stick.

> do something other than dumb scaling up

This is exactly what people told OpenAI 8 years ago and look where we are now.

8 years ago the dumb scaling up was exactly what we needed. 8 years we’ve been riding that train. Don’t you think it’s time to try something new?