Hacker News
by
hughw
1 day ago
Just this morning I tweaked my single 3090 setup too:
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 OLLAMA_CONTEXT_LENGTH=180000
and that fits in 23GB.
[edited for format]
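For anyone who wants to script the settings above, here is a minimal sketch in Python. It assumes the `ollama` binary is on PATH and that the variables must be set on the server process (`ollama serve`) rather than on the client. The names `OLLAMA_SETTINGS`, `build_ollama_env`, and `start_ollama_server` are hypothetical helpers made up for this illustration, not part of Ollama.

```python
import os
import subprocess

# Values copied from the comment above. Environment variables are always
# strings, so the numbers are quoted.
OLLAMA_SETTINGS = {
    "OLLAMA_FLASH_ATTENTION": "1",
    "OLLAMA_KV_CACHE_TYPE": "q8_0",
    "OLLAMA_CONTEXT_LENGTH": "180000",
}


def build_ollama_env(base_env=None):
    """Return a copy of base_env (default: os.environ) with the settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(OLLAMA_SETTINGS)
    return env


def start_ollama_server():
    """Start `ollama serve` with the settings applied (assumes `ollama` is on PATH)."""
    return subprocess.Popen(["ollama", "serve"], env=build_ollama_env())
```

Calling `start_ollama_server()` returns a `subprocess.Popen` handle for the server. Whether the result fits in 23GB on a given card will depend on the model being loaded, which the comment does not name.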
MaKey
6 hours ago
Friends don't let friends use Ollama:
https://sleepingrobots.com/dreams/stop-using-ollama/