| HN Mirror



	by ellisv 1181 days ago
	That’d be cool, but I don’t think most idle consumer GPUs (6-8GB) have enough memory for even a single iteration (batch size 1) of a modern LLM. But I’d love to see more federated/distributed learning platforms.


mirekrusin 1181 days ago

6GB can store about 3 billion parameters at 16-bit precision (2 bytes each); GPT-3 has 175 billion parameters.
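A rough back-of-the-envelope sketch of that arithmetic, assuming 2 bytes per parameter (fp16/bf16) and counting only the weights, not activations, gradients, or optimizer state. The function names are made up for illustration.

```python
import math


def params_that_fit(memory_gb: float, bytes_per_param: int = 2) -> float:
    """Billions of parameters whose weights fit in memory_gb (weights only)."""
    return memory_gb * 1e9 / bytes_per_param / 1e9


def gpus_needed(params_billion: float, memory_gb: float, bytes_per_param: int = 2) -> int:
    """Minimum number of cards needed just to hold the weights."""
    total_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return math.ceil(total_gb / memory_gb)


print(params_that_fit(6))    # billions of parameters on a 6GB card
print(gpus_needed(175, 6))   # 6GB cards for a 175B-parameter model
print(gpus_needed(175, 80))  # 80GB cards for the same model
```

Training needs considerably more memory than this, so treat these numbers as a lower bound.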


whalesalad 1181 days ago

Is it possible to break the model apart? Or does the entire thing need to be architected from the get-go such that an individual GPU can own a portion end to end?


PeterisP 1181 days ago

It's possible to break the model apart (for the larger models, it's not just that an 8GB card isn't enough; even a single 80GB card isn't enough), but that needs a high-speed interconnect (Nvidia pods provide hundreds of Gbps, and use all of that) because you need to exchange those parameters quite often. So you're limited by the interconnect speed just as much as by your compute.
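To put a rough number on the interconnect point, here is a minimal sketch of how long one full exchange of a shard's parameters would take over a given link. The link speeds (400 Gbps for a datacenter fabric, 100 Mbps for a home uplink) and the 3-billion-parameter shard are illustrative assumptions, and the estimate ignores latency, protocol overhead, and whatever a real parallelism scheme actually sends (gradients, activations, or both).

```python
def exchange_seconds(params_billion: float, link_gbps: float, bytes_per_param: int = 2) -> float:
    """Seconds to send params_billion parameters once over a link_gbps link."""
    bits = params_billion * 1e9 * bytes_per_param * 8
    return bits / (link_gbps * 1e9)


# Hypothetical 3B-parameter shard (what fits on a 6GB card at 16 bits).
datacenter = exchange_seconds(3, 400)  # assumed 400 Gbps interconnect
home = exchange_seconds(3, 0.1)        # assumed 100 Mbps home uplink

print(f"datacenter: {datacenter:.2f} s per exchange")
print(f"home:       {home:.0f} s per exchange")
```

If an exchange like this happens every training step, the home link spends minutes per step on transfer alone, which is the bottleneck described above.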
