Sequoia: Speculative decoding boosting LLM inference by 8-10x

Sequoia: Speculative decoding boosting LLM inference by 8-10x (infini-ai-lab.github.io)
3 points by fgfm 869 days ago