HN Mirror

	by 0xc133 5 hours ago
	With the YaRN and RoPE scaling arguments for llama.cpp, you could run qwen3.6-27B with a 1M-token context… if you have enough memory to store it.
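
	For a rough sense of what "enough memory" means, here is a back-of-the-envelope sketch of the KV cache size at that context length. It assumes a plain transformer where every layer uses full attention with grouped-query attention and an fp16 cache. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the model's actual config, so substitute the real values from the model card.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a standard attention transformer.

    Assumes every layer caches one K and one V tensor of shape
    (n_ctx, n_kv_heads, head_dim). bytes_per_elem=2 corresponds to fp16.
    """
    # Factor of 2: one tensor for keys and one for values in each layer.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Hypothetical architecture values, NOT taken from the real model config.
size = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, n_ctx=1_000_000)
print(f"{size / 2**30:.1f} GiB")  # prints "244.1 GiB" for these placeholder values
```

	That figure is for the cache alone, on top of the weights. A quantized KV cache, or a model where only some layers use full attention, would need less, so treat it as an upper-bound style estimate under the stated assumptions.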