Hacker News
by petu, 64 days ago
> Qwen3.5-27b 8-bit quant 20 to 25 tok/sec
Is that with some kind of speculative decoding? Or is it total throughput across parallel requests?