by ml_basics 25 days ago
this will change as inference demand increases (which is happening right now faster than many people expected)

At the same time, the training paradigm being scaled, reinforcement learning, is significantly less data-efficient than next-token prediction. You basically need to run an agent for minutes (or longer if you want good long-horizon performance), only to give it a binary pass/fail: at most one bit of information.
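A back-of-the-envelope sketch of the "one bit" point, in case it helps. A pass/fail outcome carries at most the binary entropy of the pass rate, which peaks at 1 bit when the agent passes half the time. The pass rates and the 10,000-token episode length below are made-up numbers for illustration, not measurements from any real training run.

```python
import math


def binary_entropy_bits(p: float) -> float:
    """Shannon entropy, in bits, of a pass/fail outcome with pass probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def reward_bits_per_token(pass_rate: float, episode_tokens: int) -> float:
    """Upper bound on reward information per generated token in one episode."""
    return binary_entropy_bits(pass_rate) / episode_tokens


if __name__ == "__main__":
    episode_tokens = 10_000  # hypothetical rollout length
    for pass_rate in (0.5, 0.1, 0.01):  # hypothetical pass rates
        bits = binary_entropy_bits(pass_rate)
        per_token = reward_bits_per_token(pass_rate, episode_tokens)
        print(f"pass rate {pass_rate}: {bits:.3f} bits/episode, "
              f"{per_token:.2e} bits/token")
```

This only bounds the information in the reward signal itself. It says nothing about how efficiently a given RL algorithm uses that signal, and dense or shaped rewards would change the picture.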

Inference compute is definitely scaling fast, but to scale RL, training and R&D compute also need to scale hard. I don't think it's obvious that inference will overtake R&D/training, unless there's a reputable source that states that.

Do you have a reference?