	by cma 121 days ago
	When you predict with the small model, the big model can verify the predicted tokens together as a batch, so its speed is closer to that of processing input tokens. That only holds if the predictions are good; wherever a prediction is wrong, the big model has to redo the tokens from that point on.
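
	A rough sketch of that draft-then-verify loop, in case it helps. Everything here is made up for illustration: `draft_model` and `target_model` are toy greedy functions rather than real networks, `k` is an assumed draft length, and the "batched" verify is only simulated with a list comprehension and counted as one big-model pass.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a "model" maps a token context to its greedy next token.
Model = Callable[[List[int]], int]


def target_model(ctx: List[int]) -> int:
    # Toy stand-in for the big model (deterministic arithmetic, not a real LM).
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_model(ctx: List[int]) -> int:
    # Toy stand-in for the small model: agrees with the target except when the
    # context length is a multiple of 5. A real draft would be a separate,
    # cheaper model; calling target_model here is only a shortcut for the demo.
    t = target_model(ctx)
    return t if len(ctx) % 5 else (t + 1) % 50


def plain_decode(prompt: List[int], n_new: int, target: Model) -> List[int]:
    # Baseline: one big-model pass per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(
    prompt: List[int], n_new: int, k: int, draft: Model, target: Model
) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0
    while len(tokens) < goal:
        # 1. The small model proposes k tokens, one at a time.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The big model scores all k+1 positions. On real hardware this is
        #    a single batched forward pass, similar to prompt processing;
        #    here it is simulated and counted as one pass.
        verified = [target(tokens + proposal[:i]) for i in range(k + 1)]
        target_passes += 1
        # 3. Keep the longest prefix where the draft matched the big model.
        n_ok = 0
        while n_ok < k and proposal[n_ok] == verified[n_ok]:
            n_ok += 1
        tokens.extend(proposal[:n_ok])
        # 4. Take the big model's own token at the first mismatch (or one
        #    extra token if everything matched); later drafts are discarded.
        tokens.append(verified[n_ok])
    return tokens[:goal], target_passes


spec_tokens, passes = speculative_decode([1, 2, 3], 20, 4, draft_model, target_model)
plain_tokens = plain_decode([1, 2, 3], 20, target_model)
```

	With greedy decoding the output matches what the big model would have produced alone; the only thing that changes is how many big-model passes it takes. When the draft is wrong most of the time, each pass accepts about one token and you are back to ordinary decoding plus the draft overhead.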