by Der_Einzige 527 days ago
Everyone keeps claiming this but we have zero evidence of any kind of scaling wall whatsoever. Oh you mean data? Synthetic Data, Agents, and Digitization solve that.
3 comments

I disagree, but I also wasn’t referring to the exhaustion of training materials. I am referring to the fact that exponentially more compute is required to achieve linear gains in performance. At some point, it just won’t be feasible to do $50B training runs, you know?
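To put rough numbers on "exponentially more compute for linear gains", here is a minimal sketch. It assumes a toy model in which every additional point of performance costs a fixed multiple of compute; the function names and the 10x factor are hypothetical illustrations, not measured values from any lab.

```python
import math

def compute_for_score(score, base_compute=1.0, factor=10.0):
    """Compute needed to reach `score`, assuming each +1 point
    multiplies the required compute by `factor` (a toy assumption)."""
    return base_compute * factor ** score

def score_for_compute(compute, base_compute=1.0, factor=10.0):
    """Inverse of the above: the score grows only with log(compute)."""
    return math.log(compute / base_compute, factor)

if __name__ == "__main__":
    # Each extra point of score costs 10x the compute of the previous one.
    for score in range(5):
        print(score, compute_for_score(score))
```

Under that assumption, going from a score of 3 to 4 costs ten times as much as going from 2 to 3, which is why the budget for each next step balloons even though the gain per step stays the same.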
$50B still seems reasonable compared to the revenue of the Big AI companies.
What revenues? If by big AI companies you mean LLM service providers (OpenAI, ...), their revenues are far from high, and they are not profitable. https://www.cnbc.com/2024/09/27/openai-sees-5-billion-loss-t...

Maybe Nvidia, but they are a chip / hardware maker first. And even for them, a $50B training run with no exponential gains seems unreasonable.

Better to optimize the architecture / approach first, which is also what most companies are doing now before scaling out.

It's not unusual to make infrastructure investments that will pay off in 30-50 years. I don't see why an AI model couldn't be one, unless it's not true that we're at the end of scaling.
There were multiple reports confirming that OpenAI's Orion (planned to be GPT-5) yielded unexpectedly weak results.
And OpenAI isn't the only one facing this problem. Anthropic and Google are as well.
So DeepSeek V3 did nothing to show you how wrong this take is?
And costs $500 million per training run.
There seems to be an affordable scaling wall.