Hacker News
by
0xc133
5 hours ago
With the YaRN and RoPE scaling arguments for llama.cpp, you could run qwen3.6-27B with 1M context… if you have enough memory to store the KV cache.
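
For a rough sense of what "enough memory" means, here is a back-of-the-envelope sketch of KV cache size. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the actual qwen3.6-27B config, and the formula assumes plain full attention in every layer with an fp16 cache; read the real values from the model's metadata before trusting the number.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV cache size for a standard transformer with full attention.

    Each layer stores one K and one V vector of n_kv_heads * head_dim
    elements per token, hence the leading factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Hypothetical architecture numbers for illustration only
# (NOT taken from the real qwen3.6-27B config).
size = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, n_ctx=1_000_000)
print(f"{size / 2**30:.1f} GiB")  # 244.1 GiB with these placeholder values
```

Quantizing the cache (a smaller bytes_per_elem) or a model that uses full attention in only some layers would shrink this, possibly by a lot, so treat the result as an upper-end estimate for the assumed shape.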