Hacker News
How continuous batching improves LLM inference throughput 23x (twitter.com)
1 point by george_123 1096 days ago