HN Mirror



	by Dylan16807 252 days ago
	Depends on what you're doing. I'm pretty sure the bandwidth for inference isn't much.

1 comment

Depends on whether it's tensor parallel (TP) or pipeline parallel (PP). PP doesn't pass much between devices. TP does.
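
To put rough numbers on that, here is a back-of-the-envelope sketch, not a measurement. It assumes Megatron-style TP (two all-reduces per transformer layer in the forward pass), a ring all-reduce, fp16 activations, and single-token decoding. The function names and the model shape in the example (hidden size 8192, 80 layers, 4 GPUs) are made up for illustration.

```python
def pp_bytes_per_token(hidden_size, num_stages, bytes_per_elem=2):
    """Pipeline parallel: one hidden-state vector crosses each stage boundary."""
    return (num_stages - 1) * hidden_size * bytes_per_elem


def tp_bytes_per_token(hidden_size, num_layers, tp_degree, bytes_per_elem=2):
    """Tensor parallel: bytes each device sends per generated token.

    Assumes two all-reduces per layer (after attention and after the MLP)
    and a ring all-reduce, where each device sends 2 * (n - 1) / n times
    the payload size.
    """
    payload = hidden_size * bytes_per_elem
    ring_factor = 2 * (tp_degree - 1) / tp_degree
    return num_layers * 2 * ring_factor * payload


# Hypothetical model shape, chosen only for illustration.
pp = pp_bytes_per_token(hidden_size=8192, num_stages=4)
tp = tp_bytes_per_token(hidden_size=8192, num_layers=80, tp_degree=4)
print(f"PP: {pp} bytes/token, TP: {tp:.0f} bytes/token per device")
```

Under these assumptions TP moves about 80x more data per token than PP, and it does so in many small latency-sensitive all-reduces rather than a few point-to-point sends, which is why the interconnect matters more for TP.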