| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by calaphos 414 days ago
	There's still a throughput/latency tradeoff curve, at least for any sort of interactive models. One of the reasons why inference providers sell batch discounts.