| HN Mirror



	by zozbot234 76 days ago
	> These models are dumber and slower than API SoTA models and will always be.

	Sure, but you're paying per-token costs on the SoTA models that are roughly an order of magnitude higher than third-party inference on the locally available models. So when you account for per-token cost, the math skews the other way.
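
	To make the arithmetic concrete, here is a rough sketch. The prices, the task size, and the 3x token overhead for the weaker model are all made-up placeholders; the only thing taken from the comment above is the roughly 10x price ratio.

```python
# Hypothetical prices in USD per million tokens. Only the ~10x ratio
# reflects the comment; the absolute numbers are illustrative assumptions.
SOTA_PRICE_PER_MTOK = 10.00
OPEN_PRICE_PER_MTOK = 1.00


def cost_usd(tokens: int, price_per_mtok: float) -> float:
    """Cost of processing `tokens` tokens at a given per-million-token price."""
    return tokens / 1_000_000 * price_per_mtok


def breakeven_token_multiplier(sota_price: float, open_price: float) -> float:
    """How many times more tokens the cheaper model can use before it costs the same."""
    return sota_price / open_price


# Assumed workload: the SoTA model finishes a task in 200k tokens, while the
# open model needs 3x as many (retries, longer reasoning). Both are guesses.
task_tokens = 200_000
open_overhead = 3

sota_cost = cost_usd(task_tokens, SOTA_PRICE_PER_MTOK)
open_cost = cost_usd(task_tokens * open_overhead, OPEN_PRICE_PER_MTOK)
multiplier = breakeven_token_multiplier(SOTA_PRICE_PER_MTOK, OPEN_PRICE_PER_MTOK)

print(f"SoTA cost: ${sota_cost:.2f}")
print(f"Open cost: ${open_cost:.2f}")
print(f"Break-even token multiplier: {multiplier:.1f}x")
```

	Under these assumptions the cheaper model stays ahead on cost until it burns about 10x the tokens for the same task. This says nothing about latency or output quality, which the quoted objection is also about.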