| HN Mirror

	by madisonmay 1261 days ago
	Interestingly, it sounds like offloading could be made quite efficient in a batch setting if you primarily care about throughput rather than latency, though I guess latency is quite important for most current LLM applications.
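
	To make the throughput-versus-latency trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a toy cost model in which the offloaded weights are streamed to the accelerator once per step and that cost is shared by every sequence in the batch. The function name `offload_step_cost` and both timing constants are made up for illustration, not measurements from any real system.

```python
def offload_step_cost(batch_size, transfer_s=1.0, compute_per_seq_s=0.01):
    """Toy model of one forward step with offloaded weights.

    transfer_s: hypothetical time to stream the weights in once per step.
    compute_per_seq_s: hypothetical compute time per sequence in the batch.
    Returns (latency in seconds, throughput in sequences per second).
    """
    # The transfer cost is paid once per step, regardless of batch size.
    latency = transfer_s + batch_size * compute_per_seq_s
    throughput = batch_size / latency
    return latency, throughput


if __name__ == "__main__":
    for b in (1, 8, 64, 512):
        latency, throughput = offload_step_cost(b)
        print(f"batch={b:4d}  latency={latency:.2f}s  throughput={throughput:.1f} seq/s")
```

	Under these assumed numbers, larger batches spread the fixed transfer cost over more sequences, so throughput goes up, but every request still waits for the whole step, so latency never drops below the transfer time.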