| HN Mirror

	by boltzmann-brain 113 days ago
	> Our implementation is up to 2x faster than optimized speculative decoding baselines and up to 5x faster than autoregressive decoding with open source inference engines

	What about per-FLOP?