Hacker News
by vardalab 10 days ago
Better prompt processing, like 1.5x+, and more KV cache, but tg (token generation) is most likely lower, like 0.8x or so. I'm just going by memory though, for Qwen3.5 without MTP.