| HN Mirror



	by nacs 263 days ago
	I don't know if the price will stay this low, but the whole point of v3.2 is to be cheaper to run than v3.1 and earlier. (Inference is now cheaper for them as context grows, because of the sparse attention mechanism.)
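	To make the scaling argument concrete, here is a rough sketch rather than the actual v3.2 mechanism. It assumes a top-k style sparse attention where each new token attends to at most `k` earlier tokens. The budget `k = 2048` and the function names are illustrative assumptions, and the cost of choosing which tokens to attend to is ignored.

```python
# Toy comparison: how many earlier tokens one new token attends to.
# Assumption: sparse attention keeps a fixed budget of k tokens (k is hypothetical).

def dense_attended(context_len: int) -> int:
    """Dense attention: the new token attends to every token in the context."""
    return context_len


def sparse_attended(context_len: int, k: int = 2048) -> int:
    """Top-k sparse attention: the new token attends to at most k tokens."""
    return min(context_len, k)


def cost_ratio(context_len: int, k: int = 2048) -> float:
    """Sparse work as a fraction of dense work for one new token."""
    return sparse_attended(context_len, k) / dense_attended(context_len)


if __name__ == "__main__":
    for n in (1_000, 8_000, 64_000, 128_000):
        print(f"context={n:>7}  dense={dense_attended(n):>7}  "
              f"sparse={sparse_attended(n):>5}  ratio={cost_ratio(n):.3f}")
```

	Under these assumptions the two are identical for short contexts, and the sparse per-token cost stays flat once the context exceeds `k`, so the savings grow with context length.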