by tsimionescu
623 days ago
I think by far the biggest advances are related to compute power. The processing capacity needed to run training algorithms on the volumes of data that the latest models require was just not available even five years ago, and definitely not ten years ago. I'm sure there are gains from the model architecture as well, but I don't think that running the best algorithms we have today on hardware from five to ten years ago would have worked within any reasonable amount of time or money.
We have GPT-4-tier (or at least GPT-3.5-tier) performance in these much smaller models now. If we teleported back in time, it might have been possible to build