Hacker News new | ask | show | jobs
by whalesalad 1180 days ago
Is it possible to break the model apart? Or does the entire thing need to be architected from the get-go such that an individual GPU can own a portion end to end?
1 comments

It's possible to break the model apart (I mean, for the larger models it's not that a 8Gb card isn't enough but even a single 80Gb card isn't enough) but that needs a high-speed interconnect (Nvidia pods provide hundreds of Gbps, and use all of that) as you need to exchange those parameters quite often, so you're just as limited by your compute as you are by the interconnect speed.