Accelerating LLM Inference with Parallel Draft Models (PARD)

	Accelerating LLM Inference with Parallel Draft Models (PARD) (amd.com)
	1 point by dhruvdh 437 days ago