Hacker News
vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction (blog.vllm.ai)
3 points by xmo 651 days ago