by dlcarrier
5 days ago
That's a big if. The big commercial models seem to gain far more from pre-processing than they do from size, and you can already run pretty useful models on desktop hardware.

Check out this video about how DeepMind significantly improved performance: https://youtu.be/Dkqzqw8rxXI They basically ran the LLM tuning through an old-school genetic- or annealing-style algorithm and trounced what a larger model could do alone.