Weren't they just getting better mostly because they were being scaled up? There's no way to do that once you've exhausted all of the data. Besides, progress has slowed down at this point anyway.
Not only. Look at the subject of this thread, GPT-4o mini.
I'm optimistic about synthetic data giving us another big unlock, anyway. The text on the internet is not that reasoning dense. And they have a snapshot of pre-2023 that is fixed and guaranteed not to decay. I don't think one extra year of good quality internet is what will make or break AGI efforts.
The harder bottleneck will be energy. It's relatively doable to go from 1GW to 10GW but the next jump to 100GW becomes insanely difficult.
GPT-3 had 175B parameters and it's very bad compared to the much smaller models we have nowadays. The data and the compute play a giant role. Also, I doubt you would need to train a model further after you had trained it on absolutely everything (but we are very far from that).