Hacker News
1M Tokens/s: Scaling Qwen 3.5 27B on 96 B200 GPUs with vLLM (medium.com)
3 points by m4r1k 79 days ago