| HN Mirror



	by oezi 314 days ago
	One thing I've been wondering about the open models released by big labs is how much more they could improve with additional training. GPT-OSS has 2.1m hours of training; how much score improvement could we see at double that?

2 comments

ModelForge 314 days ago

I think GPT-4.5 was potentially the original GPT-5 model: larger and pre-trained on more data. Too bad it was too expensive to deploy at scale, so we never saw the RL-ed version.


poorman 314 days ago

As we saw with GPT-5, RL training doesn't scale forever.


energy123 314 days ago

Unless GPT-5 is 30% cheaper to run than o3. In that case it's scaling brilliantly, given the small gap between release dates. People are drawing too many conclusions from too little information.


oezi 314 days ago

I meant scaling the base training before RL.
