Hacker News new | ask | show | jobs
by petu 64 days ago
> Qwen3.5-27b 8-bit quant 20 to 25 tok/sec

Is that with some kind of speculative decoding? Or is it total throughput across parallel requests?