HN Mirror

	by spidersouris 111 days ago
	Note that a similar idea was already proposed by Shen et al. (2025) in "Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism" (https://arxiv.org/abs/2506.01979), though with lower performance.