HN Mirror



	by nickpsecurity 373 days ago
	Don't the parallelizing techniques of a 4x build make using them more difficult than a 1x build with no extra parallelism? Couldn't the 32GB 4090 handle more models in their original configurations?

2 comments

ijk 372 days ago

For LLM inference, parallel GPUs are mostly fine (you take some performance hit, but llama.cpp doesn't care what cards you use, and other stuff handles 4 symmetric GPUs just fine). You run into more problems when you're doing anything training-related, though.
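
As a rough illustration of why mixed cards are workable for inference, here is a minimal sketch of splitting a model's layers across GPUs in proportion to each card's VRAM. The function name `split_layers` and the VRAM figures are hypothetical, and this is not llama.cpp's actual allocation logic, only the general idea of a proportional layer split.

```python
def split_layers(n_layers, vram_gb):
    """Assign whole layers to GPUs in proportion to each card's VRAM.

    Hypothetical helper for illustration only; not taken from llama.cpp.
    """
    total = sum(vram_gb)
    # Ideal fractional share of layers for each GPU.
    shares = [n_layers * v / total for v in vram_gb]
    # Start by giving each GPU the whole-number part of its share.
    counts = [int(s) for s in shares]
    # Hand the leftover layers to the GPUs with the largest remainders.
    leftover = n_layers - sum(counts)
    order = sorted(
        range(len(vram_gb)),
        key=lambda i: shares[i] - counts[i],
        reverse=True,
    )
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Four symmetric cards: every GPU gets the same number of layers.
print(split_layers(80, [24, 24, 24, 24]))
# Mixed cards: the larger cards take proportionally more layers.
print(split_layers(80, [24, 24, 12, 12]))
```

With symmetric cards the split is even; with mixed cards the smaller ones simply hold fewer layers, which is roughly why inference tolerates a mismatched build better than training does.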


zargon 373 days ago

> Don't the parallelizing techniques of a 4x build make using them more difficult than a 1x build with no extra parallelism?

For inference, no. For training, only slightly.
