
	by chr15m 80 days ago
	Is this something that will show up in Ollama any time soon to increase the context size of local models?

1 comment

KV quantization has long been available in llama.cpp.

Yes, but the optimisation described has not, right?