| HN Mirror


	by dvdkon 14 days ago
	As far as I know, speculative decoding still verifies that the proposed tokens are what the "big" model would generate; it just uses the guesses to make that process faster. Setting the probability threshold too low therefore shouldn't affect correctness, only speed (time is wasted verifying bad guesses).
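	To make that concrete, here is a toy sketch of the greedy case. It is not how any particular inference engine implements it: `target`, `good_draft`, and `bad_draft` are made-up stand-in functions rather than real models, the draft's quality stands in for a loose or strict threshold, and the per-token loop stands in for what a real system would do as one batched forward pass of the big model.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]  # maps a token prefix to the next (greedy) token


def target(prefix: List[int]) -> int:
    # Hypothetical "big" model: an arbitrary deterministic toy rule.
    return (sum(prefix) * 31 + len(prefix)) % 10


def good_draft(prefix: List[int]) -> int:
    # Hypothetical draft that agrees with the target except at every 4th position.
    t = target(prefix)
    return t if len(prefix) % 4 else (t + 1) % 10


def bad_draft(prefix: List[int]) -> int:
    # Hypothetical draft that always guesses wrong.
    return (target(prefix) + 1) % 10


def greedy_decode(model: Model, prompt: List[int], n: int) -> List[int]:
    out = list(prompt)
    for _ in range(n):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n: int, k: int = 4
) -> Tuple[List[int], int]:
    out = list(prompt)
    goal = len(prompt) + n
    passes = 0  # number of verification rounds by the big model
    while len(out) < goal:
        # The draft model proposes k tokens, conditioning on its own guesses.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # One verification round (a single batched pass in a real system).
        passes += 1
        for tok in proposal:
            expected = target(out)
            out.append(expected)  # the emitted token is always the target's choice
            if tok != expected or len(out) >= goal:
                break  # first mismatch: discard the rest of the proposal
    return out, passes
```

	With a prompt of `[1, 2, 3]` and 12 new tokens, both drafts should produce exactly the same sequence as `greedy_decode(target, ...)`; the always-wrong draft just needs one verification round per token, while the mostly-right draft needs far fewer. Real implementations also sample rather than always taking the argmax, and usually get a bonus token when the whole proposal is accepted, which this sketch leaves out.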

1 comment

But won't setting it to accept 100% of the proposed tokens skip the verification?