by zurfer
88 days ago
"In this scaffold, several other models were able to solve the problem as well: Opus 4.6 (max), Gemini 3.1 Pro, and GPT-5.4 (xhigh)." I find that very surprising. This problem seemed out of reach 3 months ago, but now three frontier models are able to solve it. Is everybody distilling each other's models? Are companies selling the same data and RL environments to all the big labs? Can anybody more involved share some rumors? :P I do believe that AI can solve hard problems, but the fact that progress is so evenly spread across labs in such a narrow domain makes me a bit suspicious that there is a hidden factor. Like, did some "data worker" solve a problem like that, and it's now in the training data?
|
A lot of this is probably just throwing roughly equal amounts of compute at continuous RLVR training. I'm not convinced there's any big research breakthrough that separates GPT-5.4 from GPT-5.2. The difference is probably more than just a later checkpoint but less than a neural architecture change, and closer to the former than the latter.
I think it's just easy to underestimate how much impact continuous training plus scaling can have on the underlying capabilities.