Hacker News
by
nicman23
32 days ago
> 260k context
with a single 5090?
kgeist
31 days ago
Yep. The Gated DeltaNet layers in Qwen3.6 keep a fixed-size recurrent state instead of a per-token KV cache, so the model needs much less VRAM for the KV cache than previous generations. Plus, the KV cache is 8-bit.
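For a rough sense of the arithmetic, here is a back-of-the-envelope sketch. The layer counts, KV head count, and head size below are made-up illustrative values, not Qwen3.6's actual config, and the small fixed-size state of the linear-attention layers is ignored. It only shows how the cache scales when fewer layers keep a per-token cache and each element shrinks from 16 to 8 bits.

```python
def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Size of a per-token KV cache: one K and one V tensor per attention layer."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


GIB = 1024 ** 3
CONTEXT = 260_000  # the context length quoted upthread

# Hypothetical "previous generation": every layer is full attention, 16-bit cache.
all_attention_fp16 = kv_cache_bytes(
    n_attn_layers=48, n_kv_heads=8, head_dim=128,
    context_len=CONTEXT, bytes_per_elem=2,
)

# Hypothetical hybrid: only a quarter of the layers are full attention
# (the rest are assumed to be linear-attention layers with no per-token cache),
# and the cache is stored in 8 bits.
hybrid_int8 = kv_cache_bytes(
    n_attn_layers=12, n_kv_heads=8, head_dim=128,
    context_len=CONTEXT, bytes_per_elem=1,
)

print(f"all-attention, 16-bit: {all_attention_fp16 / GIB:.1f} GiB")
print(f"hybrid, 8-bit:         {hybrid_int8 / GIB:.1f} GiB")
print(f"reduction factor:      {all_attention_fp16 / hybrid_int8:.0f}x")
```

With these assumed numbers, a 4x cut in cached layers and a 2x cut in bytes per element multiply into an 8x smaller cache, which is the kind of gap that decides whether a long context fits on one card.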
nicman23
31 days ago
is it in llama.cpp?