| HN Mirror



	by woadwarrior01 86 days ago
	> Sorry to shatter your bubble, but this is patently false: LLMs are far more efficient on hardware that serves many requests simultaneously. You might want to read this: https://arxiv.org/abs/2502.05317v2
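
	For intuition on the batching claim in the quote, here is a rough roofline-style sketch. It assumes a dense model whose weights are streamed from memory once per decode step regardless of how many requests share that step, and it ignores KV-cache traffic and scheduling overhead. The function names and every number below (7B parameters, 2 bytes per parameter, 1 TB/s bandwidth, 200 TFLOP/s) are illustrative assumptions, not figures from the linked paper.

```python
def decode_step_time(batch_size, params=7e9, bytes_per_param=2,
                     mem_bw=1e12, peak_flops=2e14):
    """Estimate seconds per decode step (hypothetical hardware numbers)."""
    # Weights are read from memory once per step, shared by the whole batch.
    memory_time = params * bytes_per_param / mem_bw
    # Roughly 2 FLOPs per parameter for each token generated in the step.
    compute_time = 2 * params * batch_size / peak_flops
    # The step takes as long as the slower of the two resources.
    return max(memory_time, compute_time)


def tokens_per_second(batch_size, **hw):
    """Aggregate throughput across all requests in the batch."""
    return batch_size / decode_step_time(batch_size, **hw)


if __name__ == "__main__":
    for b in (1, 8, 32, 256):
        print(f"batch={b:4d}  tokens/s={tokens_per_second(b):10.1f}")
```

	Under these assumptions, throughput grows almost linearly with batch size while the step is memory-bound, then flattens once compute becomes the limit. A single-user device pays the full weight-read cost for every token it generates.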