Pipeline-parallel LLM inference across GPUs on separate machines

	Pipeline-parallel LLM inference across GPUs on separate machines (github.com)
	5 points by ngaut 6 days ago