| HN Mirror



	by born-jre 114 days ago
	I think this matters more at lower batch sizes (local LLMs and private enterprise deployments, where there won't be enough concurrent users at any given time to fill a big batch), since it moves the bottleneck from memory I/O to compute.
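	To make the batch-size point concrete, here is a rough roofline-style sketch, not taken from the thread. It assumes a single square weight matrix, 2-byte parameters, and made-up hardware figures (100 TFLOP/s peak compute, 1 TB/s memory bandwidth). The function names are hypothetical, and it ignores KV cache traffic and everything else a real model does.

```python
def arithmetic_intensity(batch_size, d_model, bytes_per_param=2):
    """FLOPs per byte moved for one (d_model x d_model) matmul over a batch."""
    # One multiply-add per weight per batch row, counted as 2 FLOPs.
    flops = 2 * batch_size * d_model * d_model
    # Weights are read once regardless of batch size.
    weight_bytes = d_model * d_model * bytes_per_param
    # Input and output activations scale with the batch.
    activation_bytes = 2 * batch_size * d_model * bytes_per_param
    return flops / (weight_bytes + activation_bytes)


def bottleneck(batch_size, d_model, peak_flops, mem_bandwidth, bytes_per_param=2):
    """Return "memory" or "compute" depending on which limit is hit first."""
    # The ridge point is the intensity at which compute and memory time are equal.
    ridge = peak_flops / mem_bandwidth
    intensity = arithmetic_intensity(batch_size, d_model, bytes_per_param)
    return "compute" if intensity >= ridge else "memory"


if __name__ == "__main__":
    # Assumed hardware numbers, for illustration only.
    PEAK_FLOPS = 100e12
    MEM_BANDWIDTH = 1e12
    for batch in (1, 8, 64, 512):
        print(batch, bottleneck(batch, 4096, PEAK_FLOPS, MEM_BANDWIDTH))
```

	Under those assumptions a batch of 1 does roughly one FLOP per byte of weights read, far below the ridge point, so the hardware mostly waits on memory. Only large batches get past it, which is why anything that cuts memory traffic helps small-batch setups the most.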