	by Havoc 465 days ago
	Apparently, for the bigger model, checking a token is faster than generating a fresh one. So if the tiny model gets it right, you get a tiny speed bump. Can't say I fully understand why it's faster to check, either. It needs a pretty large difference in size to result in a speedup. 0.5B vs 27B is the only pairing where I've seen a speed bump.
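
A rough sketch of the idea, in case it helps. The usual explanation is that the big model can score several drafted positions in one batched forward pass, whereas generating them itself takes one pass per token. Everything below is a toy assumption for illustration: `target_next` and `draft_next` are hypothetical stand-ins for the big and tiny models, and "passes" just counts how often the big model would be invoked. It is not how any particular inference library implements this.

```python
def target_next(ctx):
    # Hypothetical "big" model: a deterministic toy rule over the context.
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx):
    # Hypothetical "tiny" model: agrees with the big one except when the
    # context length is a multiple of 4, where it guesses wrong.
    guess = target_next(ctx)
    return guess if len(ctx) % 4 else (guess + 1) % 50


def greedy_decode(prompt, n_new):
    # Baseline: the big model generates every token, one pass per token.
    out, passes = list(prompt), 0
    for _ in range(n_new):
        out.append(target_next(out))
        passes += 1
    return out, passes


def speculative_decode(prompt, n_new, k=4):
    out, passes = list(prompt), 0
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1) The tiny model drafts k tokens, one after another (cheap).
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2) The big model checks all drafted positions at once. A real
        #    system does this in a single batched forward pass; here the
        #    list comprehension simulates it and we count it as one pass.
        passes += 1
        checks = [target_next(out + draft[:i]) for i in range(k + 1)]
        # 3) Keep the longest prefix the big model agrees with, then take
        #    the big model's own token at the first disagreement (or one
        #    extra token if the whole draft was accepted).
        n_ok = 0
        while n_ok < k and draft[n_ok] == checks[n_ok]:
            n_ok += 1
        out.extend(draft[:n_ok])
        out.append(checks[n_ok])
    return out[:goal], passes


if __name__ == "__main__":
    plain, plain_passes = greedy_decode([1, 2, 3], 20)
    spec, spec_passes = speculative_decode([1, 2, 3], 20)
    print(plain == spec, plain_passes, spec_passes)
```

The output sequence is identical to what the big model would have produced alone; the only thing that changes is how many big-model passes it takes. That also hints at why the size gap matters: the drafting has to be cheap enough that its cost doesn't eat the passes saved.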